Welcome to the CassIO project website
CassIO is a Python library for integrating Apache Cassandra® with generative artificial intelligence and other machine learning workloads. It abstracts away the process of accessing the advanced features of the Cassandra database, including its vector search capabilities, so that developers can concentrate on designing and refining their AI systems rather than on the details of the Cassandra integration.
Installation and usage
Installing CassIO is as simple as:
pip install cassio
It is recommended to use CassIO through a third-party framework that integrates it, such as LangChain. Depending on how the framework declares its dependencies, CassIO is either installed automatically as a sub-dependency or must be installed manually beforehand.
Example
The LangChain setup is a good example: the LangChain framework itself does not list all of the packages it might need, and it is up to the user to pick and install those that their application actually needs. In fact, you can see cassio explicitly listed in the requirements file for these demo notebooks.
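If you are unsure whether a framework has already pulled in CassIO as a sub-dependency, you can check from Python before installing it by hand. The following is a minimal sketch that uses only the standard library; the helper name `installed_version` is made up for this illustration and is not part of CassIO.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Report whether cassio is present in the current Python environment.
version = installed_version("cassio")
if version is None:
    print("cassio is not installed; run: pip install cassio")
else:
    print(f"cassio {version} is already installed")
```

Run this inside the same virtual environment you use for the framework, since each environment has its own set of installed packages.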
How to use this site
Don't just browse the website: you should clone the repository and start running the code examples yourself (notebooks, tutorials, full-fledged small applications). You'll find everything in this repo. You can also download a single notebook's code by clicking on the "Download Notebook" icon at the top of each page.
Google Colaboratory
You can also run most of the code examples directly in Google Colaboratory ("Colab" for short) after a minimal amount of setup.
Just create your Astra DB instance and get an API Key for an LLM provider and you're good to go. We will give Colab-specific setup instructions later on.
If you want to run the examples in Colab, look for the "Open in Colab" icon at the top of the page.
General pre-requisites
Most code examples require a Cassandra / Astra DB database. For convenience, the general setup instructions show how to create a free Astra DB instance, but you can of course use any Cassandra installation, provided you adapt the few lines of code that connect to your database.
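One way to keep those few connection lines easy to adapt is to gather the connection details in a single place. The sketch below is only an illustration, not the setup used by the examples on this site: the environment variable names (`DEMO_ASTRA_TOKEN`, `DEMO_ASTRA_DATABASE_ID`, `DEMO_CASSANDRA_HOSTS`, `DEMO_KEYSPACE`) and the helpers are hypothetical, and `connect_local` assumes the DataStax Python driver (the `cassandra-driver` package) is installed.

```python
import os
from dataclasses import dataclass, field
from typing import List, Mapping, Optional


@dataclass
class DBSettings:
    mode: str  # "astra" or "cassandra"
    keyspace: str
    contact_points: List[str] = field(default_factory=list)
    astra_token: Optional[str] = None
    astra_database_id: Optional[str] = None


def load_db_settings(env: Optional[Mapping[str, str]] = None) -> DBSettings:
    """Build connection settings from environment variables (hypothetical names)."""
    env = os.environ if env is None else env
    keyspace = env.get("DEMO_KEYSPACE", "demo")
    token = env.get("DEMO_ASTRA_TOKEN")
    database_id = env.get("DEMO_ASTRA_DATABASE_ID")
    if token and database_id:
        # Both Astra DB values are present: target the cloud database.
        return DBSettings("astra", keyspace, astra_token=token,
                          astra_database_id=database_id)
    # Otherwise fall back to a self-managed Cassandra, defaulting to localhost.
    hosts = env.get("DEMO_CASSANDRA_HOSTS", "127.0.0.1")
    contact_points = [h.strip() for h in hosts.split(",") if h.strip()]
    return DBSettings("cassandra", keyspace, contact_points=contact_points)


def connect_local(settings: DBSettings):
    """Open a session to a self-managed Cassandra cluster.

    Assumption: the `cassandra-driver` package is installed.
    """
    from cassandra.cluster import Cluster  # imported lazily on purpose

    cluster = Cluster(settings.contact_points)
    return cluster.connect()


print(load_db_settings().mode)
```

With this layout, switching from Astra DB to a local Cassandra is a matter of changing environment variables, while the rest of the example code stays the same.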
Vector-search Cassandra
Some of the features rely on the "Vector Search" capabilities, which are being added to Cassandra right now but have not yet made it into the official Cassandra releases.
If you want to use these, you have several options: you can make sure the Astra DB instance you create is a "Vector Database" (now in Public Preview), or you can build a version of Cassandra that implements these features from a pre-release branch and run it locally.
Keep reading to find out more.
Similarly, many of the examples need access to a third-party service for LLMs and embeddings (for instance, Google's Vertex AI or OpenAI): make sure you follow the API setup to configure the necessary API Keys and other secrets for your provider of choice.
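Before launching an example, it can help to verify that the secrets it needs are actually set. The following is a minimal sketch that assumes the secrets are exposed as environment variables; the helper `missing_secrets` is made up for this illustration, and `OPENAI_API_KEY` is used only as an example name, so substitute whatever your provider and the API setup instructions call for.

```python
import os
from typing import Iterable, List, Mapping, Optional


def missing_secrets(required: Iterable[str],
                    env: Optional[Mapping[str, str]] = None) -> List[str]:
    """Return the names in `required` that are unset or blank in `env`."""
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name, "").strip()]


# Example name only: list the variables your provider of choice requires.
missing = missing_secrets(["OPENAI_API_KEY"])
if missing:
    print("Missing secrets: " + ", ".join(missing))
else:
    print("All required secrets are set.")
```

Failing early with a clear message is usually easier to debug than an authentication error raised in the middle of a notebook.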
Per-framework specific setup
We cover Cassandra integrations with several ML-centric tools and frameworks. Each of them has its own section of the site, backed by a subdirectory with explanations and examples. To run the code examples from a section locally, follow the framework-specific setup instructions given at the top of that section: these mostly amount to creating a suitable Python environment with the right dependencies, and not much else.
Example
If you want to run the sample code for LangChain, follow these steps:
- clone this repo;
- do the general DB setup;
- do the local DB setup if needed;
- do the API setup;
- do the LangChain-specific setup.
At this point you can fire up Jupyter notebook and start running any of the provided notebooks. When moving on to testing another framework, only the last step will be needed.
If you prefer to use Colab instead, just create the Astra DB instance and obtain a valid secret for an LLM provider: the online notebook will tell you what else is needed, if anything.
Trademark
Apache®, Apache Cassandra®, and the eye logo are either registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries.