ShExStatements
ShExStatements allows users to generate Shape Expressions (ShEx) from simple CSV statements, CSV files, and spreadsheets. It can be used from the command line, via REST API, or through a modern web interface.
Python compatibility
Core CSV/Spreadsheet to ShEx conversion supports modern Python versions including Python 3.13.
CI runs on Python 3.12 and 3.13, plus 3.14-dev (allowed to fail) to detect future breakages early.
Ways to use ShExStatements
ShExStatements currently supports three primary usage modes:
WASM runtime in the browser (static frontend, no backend required)
Docker runtime (React frontend + FastAPI backend)
Python runtime (CLI and legacy Flask interface)
Quick start
1) Using Python (CLI)
Set up a virtual environment and install shexstatements:
$ python3 -m venv .venv
$ source ./.venv/bin/activate
$ pip3 install shexstatements
Run the following command with an example CSV file. The file contains an example description of a language on Wikidata and uses a comma as the delimiter between values.
$ shexstatements.sh examples/language.csv
2) Using Docker (Frontend + Backend)
Run the containerized stack:
cd docker
docker compose up
This starts:
Frontend: http://localhost:3000
Backend API: http://localhost:8000
Swagger/OpenAPI docs: http://localhost:8000/docs
For development mode with hot reloading:
cd docker
docker compose -f docker-compose.yml -f docker-compose.dev.yml up
Build from source
Terminal
Clone the ShExStatements repository.
$ git clone https://github.com/johnsamuelwrites/ShExStatements.git
Go to the ShExStatements directory.
$ cd ShExStatements
Install modules required by ShExStatements (here: installing into a virtual environment).
$ python3 -m venv .venv
$ source ./.venv/bin/activate
$ pip3 install .
Run the following command with an example CSV file. The file contains an example description of a language on Wikidata and uses a comma as the delimiter between values.
$ ./shexstatements.sh examples/language.csv
CSV files can also use other delimiters, such as ;. For example, the following command works with a file that uses a semicolon as the delimiter.
$ ./shexstatements.sh examples/languagedelimsemicolon.csv --delim ";"
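To illustrate why the delimiter has to match the file, the following Python sketch parses one semicolon-separated line with the standard csv module, once with the matching delimiter and once with the default comma. The sample row is hypothetical (it is not copied from examples/languagedelimsemicolon.csv), and this is not how ShExStatements itself parses its input.

```python
import csv
import io

# Hypothetical semicolon-delimited statement, for illustration only.
SAMPLE = "@language;wdt:P31;wd:Q34770\n"


def split_row(text: str, delim: str) -> list[str]:
    """Return the fields of the first row of `text`, split on `delim`."""
    return next(csv.reader(io.StringIO(text), delimiter=delim))


right = split_row(SAMPLE, ";")  # three separate fields
wrong = split_row(SAMPLE, ",")  # a single field holding the whole line
print(len(right), len(wrong))
```

With the wrong delimiter the whole line arrives as one field, which is why a semicolon-delimited file needs --delim ";".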
Sometimes the CSV file includes a header row. In that case, use -s or --skipheader to tell the generator to skip the header (the first line of the CSV file).
$ ./shexstatements.sh --skipheader examples/header/languageheader.csv
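The effect of skipping the header can be sketched with the standard csv module. The column names and the data row below are hypothetical, and read_rows is an illustrative helper rather than part of ShExStatements.

```python
import csv
import io

# Hypothetical CSV text with a header row; column names are illustrative.
SAMPLE = "node,property,value\n@language,wdt:P31,wd:Q34770\n"


def read_rows(text, skip_header=False):
    """Return all CSV rows, optionally dropping the first (header) line."""
    reader = csv.reader(io.StringIO(text))
    if skip_header:
        next(reader, None)  # discard the header; no error on empty input
    return list(reader)


with_header = read_rows(SAMPLE)
without_header = read_rows(SAMPLE, skip_header=True)
print(len(with_header), len(without_header))
```

Without the skip, the header line would be passed on as if it were a statement.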
It is also possible to work with spreadsheet files such as .ods, .xls, or .xlsx.
$ shexstatements.sh examples/language.ods
$ shexstatements.sh examples/language.xls
$ shexstatements.sh examples/language.xlsx
In all of the above cases, the shape expression generated by ShExStatements looks like this:
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
start = @<language>
<language> {
wdt:P31 [ wd:Q34770 ] ;# instance of a language
wdt:P1705 LITERAL ;# native name
wdt:P17 .+ ;# spoken in country
wdt:P2989 .+ ;# grammatical cases
wdt:P282 .+ ;# writing system
wdt:P1098 .+ ;# speakers
wdt:P1999 .* ;# UNESCO language status
wdt:P2341 .+ ;# indigenous to
}
It is also possible to use application profiles with the following header:
Entity_name,Property,Property_label,Mand,Repeat,Value,Value_type,Annotation
Shape expressions can then be generated with the following command:
$ ./shexstatements.sh -ap --skipheader examples/languageap.csv
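As a rough illustration of the application-profile layout, the sketch below reads such a file with csv.DictReader so that each value is addressed by its column name. The header is the one listed above; the data row and its Mand, Repeat, and Value_type values are hypothetical, and the helper is not part of ShExStatements.

```python
import csv
import io

# Header as given in this README; the data row is hypothetical.
AP_HEADER = "Entity_name,Property,Property_label,Mand,Repeat,Value,Value_type,Annotation"
AP_ROW = "language,wdt:P31,instance of,true,false,wd:Q34770,URI,instance of a language"


def read_profile(text):
    """Parse application-profile CSV text into one dict per data row."""
    return list(csv.DictReader(io.StringIO(text)))


rows = read_profile(AP_HEADER + "\n" + AP_ROW + "\n")
print(rows[0]["Property"], rows[0]["Value"])
```

Because the first line only names the columns, it carries no statement of its own, which is consistent with passing --skipheader together with -ap.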
Objectives
Easily generate shape expressions (ShEx) from CSV files and Spreadsheets
Simple syntax
Documentation and examples
Detailed documentation is available here, with example CSV files in the examples folder.
Test cases and coverage
All the test cases can be run as follows:
$ python3 -m tests.tests
A code coverage report can also be generated by running the unit tests with the coverage tool.
$ coverage run --source=shexstatements -m unittest tests.tests
$ coverage report -m
Web Interface
Modern Web Interface (v1.0+)
ShExStatements now includes a modern, feature-rich web interface built with React and TypeScript.
Using Docker (recommended):
cd docker
docker compose up
Access the interface at http://localhost:3000
Features:
Split-pane editor with Monaco Editor (VS Code-like experience)
Syntax highlighting for ShExStatements and ShEx output
Dark mode support
File upload support (CSV, ODS, XLS, XLSX)
Multiple delimiter options (comma, pipe, semicolon)
Real-time error display
Copy output to clipboard
Runtime selector (Auto, API, WASM)
Static GitHub Pages (WASM)
The frontend can run conversion directly in the browser using Python-on-WASM (Pyodide), so it can be deployed as a static site on GitHub Pages.
Enable GitHub Pages in repository settings (source: GitHub Actions).
Push to main or master.
The workflow .github/workflows/pages.yml builds and deploys the frontend with VITE_RUNTIME_MODE=wasm.
In WASM runtime:
CSV conversion to ShEx is supported in-browser.
Spreadsheet uploads (.xlsx, .xls, .ods) are also supported in-browser.
Pyodide dynamically installs Python dependencies (shexstatements, ply, and spreadsheet libraries) in the browser runtime.
Legacy Web Interface
The original Flask-based interface is still available:
$ python3 -m venv .venv
$ source ./.venv/bin/activate
$ pip3 install .
$ ./shexstatements.sh -r
Check the URL http://127.0.0.1:5000/
API
ShExStatements provides a REST API for programmatic access.
Modern API (v1.0+)
The new FastAPI-based API provides:
OpenAPI/Swagger documentation at http://localhost:8000/docs
Async request handling
Structured JSON responses with error details
Convert endpoint:
curl -X POST http://localhost:8000/api/v1/convert \
-H "Content-Type: application/json" \
-d '{"content": "@shape|prop|value", "delimiter": "|", "output_format": "shex"}'
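The same request can be issued from Python. The sketch below uses only the standard library and mirrors the URL, headers, and JSON fields of the curl example; the function names are illustrative, and the shape of the JSON response is not assumed here, so the parsed body is returned as-is.

```python
import json
import urllib.request

# Endpoint taken from the curl example; assumes the Docker stack is running.
API_URL = "http://localhost:8000/api/v1/convert"


def build_convert_request(content, delimiter="|", output_format="shex"):
    """Build (but do not send) the POST request for the convert endpoint."""
    payload = {
        "content": content,
        "delimiter": delimiter,
        "output_format": output_format,
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def convert(content, delimiter="|"):
    """Send the request and return the decoded JSON response body."""
    request = build_convert_request(content, delimiter)
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling convert("@shape|prop|value") while the backend is up should return the structured JSON response; the available fields are listed in the Swagger/OpenAPI docs at http://localhost:8000/docs.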
API documentation
Detailed API documentation (modern v1 API and legacy compatibility notes) is available here.
Deployment Modes
Standalone Python application: CLI + legacy Flask UI (./shexstatements.sh).
Docker application: React frontend + FastAPI backend (docker compose up).
Static GitHub Pages frontend: WASM runtime (no backend required for CSV-to-ShEx).
Demonstration
Online demonstrations are also available:
Conference Proceedings
Acknowledgements
Wikidata Community
Archives and Releases
Licence
All code is released under the GPLv3+ licence. The associated documentation and other content are released under CC-BY-SA.