Working with External Data Sources

SnappyData relies on the Spark SQL Data Sources API to load data in parallel from a wide variety of sources. Any data source or database that Spark can load from or save to can be accessed from within SnappyData.

There is built-in support for many data sources and data formats. Data can be accessed from S3, the local file system, HDFS, Hive, relational databases, and so on. The loaders have built-in support for CSV, Parquet, ORC, Avro, JSON, and Java/Scala objects, among other formats.
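As a rough illustration of the Data Sources API mentioned above, the following Scala sketch reads a CSV file and writes it back as Parquet using the standard Spark `DataFrameReader` and `DataFrameWriter`. It uses a plain local `SparkSession`; the assumption is that a SnappyData session, which extends `SparkSession`, exposes the same `read` and `write` calls. The object name, the sample rows, and the temporary paths are hypothetical and exist only for this example.

```scala
import java.nio.charset.StandardCharsets
import java.nio.file.Files

import org.apache.spark.sql.SparkSession

// Hypothetical example object; not part of SnappyData.
object ExternalCsvSketch {
  def main(args: Array[String]): Unit = {
    // Assumption: a local Spark session stands in for a SnappyData session.
    val spark = SparkSession.builder()
      .appName("external-csv-sketch")
      .master("local[*]")
      .getOrCreate()

    // Create a small, made-up CSV file so the sketch is self-contained.
    val csvPath = Files.createTempFile("customers", ".csv")
    Files.write(csvPath, "id,name\n1,Alice\n2,Bob\n".getBytes(StandardCharsets.UTF_8))

    // Load through the Data Sources API: pick a format, set options, load a path.
    val customers = spark.read
      .format("csv")
      .option("header", "true")      // first line holds the column names
      .option("inferSchema", "true") // let Spark guess the column types
      .load(csvPath.toString)

    customers.printSchema()

    // Save in a different built-in format using the same API in the other direction.
    val parquetDir = Files.createTempDirectory("customers-parquet").resolve("out").toString
    customers.write.format("parquet").save(parquetDir)

    // Read the Parquet copy back and print its row count.
    println(spark.read.format("parquet").load(parquetDir).count())

    spark.stop()
  }
}
```

Swapping the `format` string (for example to `json` or `orc`) and the path (for example to an HDFS or S3 URI, given the appropriate connector on the classpath) is typically all that changes between sources.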

Attention

This section currently details only the advanced connectors that SnappyData introduced. Refer to the howto section for a brief description of working with external data sources, along with some examples.

SnappyData 1.0.2.1 provides a utility to deploy third-party connectors using the SQL 'Deploy' command. Refer to Deployment of Third Party Connectors.

For more information, see: