How to use Python and R with RDF Data
Training delivered by: PHI Data Interoperability Team
Health-related data stored in the SPHN RDF schema, together with SPARQL queries, provides a solid foundation for answering specific research questions. On top of this foundation, general-purpose languages such as Python and R let data scientists apply data science methods to the retrieved data. This training gives a short introduction to using Python and R to:
- Set up a connection to a SPARQL endpoint
- Run a SPARQL query and retrieve results
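As a rough illustration of these two steps, the following minimal Python sketch prepares a request to a SPARQL endpoint and flattens the JSON results into rows. It is not the code shown in the video, which may use a dedicated client library; this version uses only the standard library. The endpoint URL (a local GraphDB instance with a repository called `sphn_demo`), the query, and the function names are all assumptions made for the example.

```python
import json
import urllib.parse
import urllib.request

# Assumption: a local GraphDB instance with a repository named "sphn_demo".
ENDPOINT = "http://localhost:7200/repositories/sphn_demo"

# Hypothetical query; any SELECT query is handled the same way.
QUERY = """
SELECT ?s ?p ?o
WHERE { ?s ?p ?o }
LIMIT 5
"""


def build_request(endpoint, query):
    """Prepare a SPARQL 1.1 Protocol POST request asking for JSON results."""
    body = urllib.parse.urlencode({"query": query}).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=body,
        headers={
            "Accept": "application/sparql-results+json",
            "Content-Type": "application/x-www-form-urlencoded",
        },
        method="POST",
    )


def bindings_to_rows(result):
    """Flatten SPARQL JSON results into a list of {variable: value} dicts."""
    variables = result["head"]["vars"]
    return [
        # Unbound variables (e.g. from OPTIONAL) are absent and become None.
        {var: binding[var]["value"] if var in binding else None
         for var in variables}
        for binding in result["results"]["bindings"]
    ]


def run_query(endpoint, query):
    """Send the query to the endpoint and return the flattened rows."""
    with urllib.request.urlopen(build_request(endpoint, query)) as response:
        return bindings_to_rows(json.load(response))


# Usage, with the triplestore running:
# rows = run_query(ENDPOINT, QUERY)
```

Separating request construction from result parsing keeps the parsing step usable on saved results, without a running triplestore.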
Building on these basics, we look at how to combine results from different queries and how to handle various datatypes.
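To make the idea of combining query results and handling datatypes concrete, here is a small Python sketch. It converts SPARQL JSON terms to Python values based on their XSD datatype and then joins two result sets on a shared variable. The sample bindings, the variable names (`patient`, `birth`, `value`), and the helper functions are hypothetical. In practice a data-frame library would typically do the join; plain Python is used here only to keep the sketch free of dependencies.

```python
from datetime import date, datetime
from decimal import Decimal

XSD = "http://www.w3.org/2001/XMLSchema#"

# Map a few common XSD datatypes to Python converters (not exhaustive;
# dates carrying a timezone suffix are not handled in this sketch).
CONVERTERS = {
    XSD + "integer": int,
    XSD + "decimal": Decimal,
    XSD + "double": float,
    XSD + "boolean": lambda v: v in ("true", "1"),
    XSD + "date": date.fromisoformat,
    XSD + "dateTime": lambda v: datetime.fromisoformat(v.replace("Z", "+00:00")),
}


def convert_term(term):
    """Convert one SPARQL JSON term; untyped terms and URIs stay strings."""
    convert = CONVERTERS.get(term.get("datatype"), str)
    return convert(term["value"])


def typed_rows(bindings):
    """Turn raw SPARQL JSON bindings into rows of typed Python values."""
    return [{var: convert_term(term) for var, term in b.items()} for b in bindings]


def inner_join(left, right, key):
    """Combine two row lists on a shared variable (inner join)."""
    index = {}
    for row in right:
        if key in row:
            index.setdefault(row[key], []).append(row)
    return [
        {**l, **r}
        for l in left if key in l
        for r in index.get(l[key], [])
    ]


# Hypothetical bindings, shaped like the "bindings" list of two query results.
patients = [
    {"patient": {"type": "uri", "value": "http://example.org/p1"},
     "birth": {"type": "literal", "datatype": XSD + "date", "value": "1980-05-17"}},
]
measurements = [
    {"patient": {"type": "uri", "value": "http://example.org/p1"},
     "value": {"type": "literal", "datatype": XSD + "double", "value": "36.6"}},
    {"patient": {"type": "uri", "value": "http://example.org/p2"},
     "value": {"type": "literal", "datatype": XSD + "double", "value": "37.2"}},
]

combined = inner_join(typed_rows(patients), typed_rows(measurements), "patient")
```

Only `p1` appears in both result sets, so `combined` holds a single row with the patient URI, a `date` object, and a `float`.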
This video assumes that your data is loaded into your triplestore (in our example, GraphDB), and that you are familiar with SPARQL. If you need instructions on loading the data into your triplestore, please watch our training “RDF Schema and Data Visualization” or read our user guide. If you need a reminder on SPARQL, please watch our training on “Querying Data with SPARQL”.
Training Contents (with timestamps):
- Introduction (00:00)
- Loading data from GraphDB in Python (02:38)
- Loading data from GraphDB in R (11:02)
- Combining results from different queries in Python (15:01)
- Combining results from different queries in R (25:08)
- Further information (35:00)