At Dato we’re big believers in open source software, so it’s with great pleasure that I announce today the open source release of SFrame, our highly scalable, column-based data frame.
With SFrame you can easily work with data that is larger than the RAM on your system. SFrame is compressed and disk-backed, and it’s optimized for data science and machine learning. It supports strictly typed columns (int, float, str, datetime), weakly typed columns (schema-free lists and dictionaries), as well as specialized types such as Image. For more on the design, see the data processing architecture blog post by our Chief Architect, Yucheng Low.
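To make "column based, compressed and disk-backed" a little more concrete, here is a toy sketch that uses only the Python standard library. It is not SFrame's implementation or its API. The helper names (`write_columns`, `read_column`), the gzip/JSON file layout, and the sample data are all assumptions made up for illustration. The idea it shows is that when each column lives in its own compressed file, a computation over one column can stream that file from disk without loading the whole table into RAM.

```python
import gzip
import json
import os
import tempfile


def write_columns(rows, directory):
    """Store each column of `rows` (a list of dicts) in its own gzip file.

    Hypothetical helper for illustration only, not part of SFrame.
    """
    paths = {}
    for name in rows[0]:
        path = os.path.join(directory, name + ".jsonl.gz")
        with gzip.open(path, "wt", encoding="utf-8") as f:
            for row in rows:
                # One JSON value per line, so typed values (int, float)
                # and schema-free values (lists, dicts) share one format.
                f.write(json.dumps(row[name]) + "\n")
        paths[name] = path
    return paths


def read_column(path):
    """Stream one column back from disk, one value at a time."""
    with gzip.open(path, "rt", encoding="utf-8") as f:
        for line in f:
            yield json.loads(line)


# Made-up sample data: two strictly typed columns and one list column.
rows = [
    {"user_id": 1, "rating": 4.5, "tags": ["sci-fi", "classic"]},
    {"user_id": 2, "rating": 3.0, "tags": []},
    {"user_id": 3, "rating": 5.0, "tags": ["drama"]},
]

with tempfile.TemporaryDirectory() as directory:
    paths = write_columns(rows, directory)
    # Only the "rating" file is opened; the other columns stay on disk.
    mean_rating = sum(read_column(paths["rating"])) / len(rows)
```

A real column store adds much more on top of this (typed encodings, indexing, lazy evaluation), but the per-column layout is what lets a scan touch only the data it needs.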
Also included in this release is our SGraph layer, built on top of SFrame, for interacting with graph data. In addition, we’re releasing the C++ surface area for our SDK.
We have also released SFrame as a Python package. You can install it like any other Python package by typing “pip install sframe” at the command line. The SFrame Python package is free to use, and since it’s core to all of our commercial offerings, we’re committed to maintaining it.
We look forward to hearing how you’re using SFrames!