Dataset Viewer MCP Server
Overview
What is Dataset Viewer?
The Dataset Viewer is a tool for exploring and visualizing datasets. It lets users navigate large datasets through a user-friendly interface, which makes it particularly useful for researchers, data scientists, and anyone who wants to analyze data without extensive programming knowledge.
Features of Dataset Viewer
- User-Friendly Interface: The Dataset Viewer offers an intuitive design that simplifies the process of data exploration.
- Data Visualization: Users can visualize data in various formats, making it easier to identify trends and patterns.
- Support for Multiple Data Formats: The tool supports various data formats, allowing users to work with CSV, JSON, and more.
- Interactive Data Exploration: Users can interact with the data directly, filtering and sorting to find specific information quickly.
- Integration with Hugging Face: The Dataset Viewer is integrated with Hugging Face, providing access to a wide range of datasets for machine learning and AI projects.
How to Use Dataset Viewer
- Access the Tool: Navigate to the Dataset Viewer on GitHub or through the Hugging Face platform.
- Upload Your Dataset: You can upload your dataset in a supported format or select from existing datasets available in the tool.
- Explore the Data: Use the interactive features to filter, sort, and visualize the data as needed.
- Analyze Results: Take advantage of the visualization tools to analyze trends and insights from your dataset.
- Export Your Findings: Once you have completed your analysis, you can export the results for further use or reporting.
Frequently Asked Questions
What types of datasets can I use with Dataset Viewer?
You can use various types of datasets, including CSV, JSON, and other common formats. The tool is designed to handle large datasets efficiently.
Is there a cost associated with using Dataset Viewer?
No, the Dataset Viewer is a public tool available for free. You can access it without any subscription or payment.
Can I collaborate with others using Dataset Viewer?
Yes, the Dataset Viewer allows for collaborative features, enabling multiple users to explore and analyze datasets together.
How do I report issues or request features for Dataset Viewer?
You can report issues or request new features by visiting the GitHub repository for Dataset Viewer and submitting an issue in the Issues section.
Is there documentation available for Dataset Viewer?
Yes, comprehensive documentation is available on the GitHub repository, providing guidance on how to use the tool effectively.
Details
Dataset Viewer MCP Server
An MCP server for interacting with the Hugging Face Dataset Viewer API, providing capabilities to browse and analyze datasets hosted on the Hugging Face Hub.
Features
Resources
- Uses the dataset:// URI scheme for accessing Hugging Face datasets
- Supports dataset configurations and splits
- Provides paginated access to dataset contents
- Handles authentication for private datasets
- Supports searching and filtering dataset contents
- Provides dataset statistics and analysis
Tools
The server provides the following tools:
- validate
  - Check if a dataset exists and is accessible
  - Parameters:
    - dataset: Dataset identifier (e.g. 'stanfordnlp/imdb')
    - auth_token (optional): For private datasets
- get_info
  - Get detailed information about a dataset
  - Parameters:
    - dataset: Dataset identifier
    - auth_token (optional): For private datasets
- get_rows
  - Get paginated contents of a dataset
  - Parameters:
    - dataset: Dataset identifier
    - config: Configuration name
    - split: Split name
    - page (optional): Page number (0-based)
    - auth_token (optional): For private datasets
- get_first_rows
  - Get first rows from a dataset split
  - Parameters:
    - dataset: Dataset identifier
    - config: Configuration name
    - split: Split name
    - auth_token (optional): For private datasets
- get_statistics
  - Get statistics about a dataset split
  - Parameters:
    - dataset: Dataset identifier
    - config: Configuration name
    - split: Split name
    - auth_token (optional): For private datasets
- search_dataset
  - Search for text within a dataset
  - Parameters:
    - dataset: Dataset identifier
    - config: Configuration name
    - split: Split name
    - query: Text to search for
    - auth_token (optional): For private datasets
- filter
  - Filter rows using SQL-like conditions
  - Parameters:
    - dataset: Dataset identifier
    - config: Configuration name
    - split: Split name
    - where: SQL WHERE clause (e.g. "score > 0.5")
    - orderby (optional): SQL ORDER BY clause
    - page (optional): Page number (0-based)
    - auth_token (optional): For private datasets
- get_parquet
  - Download entire dataset in Parquet format
  - Parameters:
    - dataset: Dataset identifier
    - auth_token (optional): For private datasets
Installation
Prerequisites
- Python 3.12 or higher
- uv: a fast Python package installer and resolver
Setup
- Clone the repository:
git clone https://github.com/privetin/dataset-viewer.git
cd dataset-viewer
- Create a virtual environment and install:
# Create virtual environment
uv venv
# Activate virtual environment
# On Unix:
source .venv/bin/activate
# On Windows:
.venv\Scripts\activate
# Install in development mode
uv add -e .
Configuration
Environment Variables
- HUGGINGFACE_TOKEN: Your Hugging Face API token for accessing private datasets
Claude Desktop Integration
Add the following to your Claude Desktop config file:
On Windows: %APPDATA%\Claude\claude_desktop_config.json
On macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
{
  "mcpServers": {
    "dataset-viewer": {
      "command": "uv",
      "args": [
        "--directory",
        "parent_to_repo/dataset-viewer",
        "run",
        "dataset-viewer"
      ]
    }
  }
}
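If the config file already lists other MCP servers, the dataset-viewer entry has to be merged in rather than pasted over them. The following sketch shows one way to do that in Python. The function name add_server and the "other-server" entry are hypothetical, and parent_to_repo remains a placeholder for the directory that contains your clone.

```python
import json


def add_server(config: dict, repo_parent: str) -> dict:
    """Return a copy of a Claude Desktop config with the dataset-viewer entry added."""
    # Copy the existing server map so the input config is not modified.
    servers = dict(config.get("mcpServers", {}))
    servers["dataset-viewer"] = {
        "command": "uv",
        "args": ["--directory", f"{repo_parent}/dataset-viewer", "run", "dataset-viewer"],
    }
    return {**config, "mcpServers": servers}


# Hypothetical existing config that already contains another server.
existing = {"mcpServers": {"other-server": {"command": "node", "args": ["server.js"]}}}
print(json.dumps(add_server(existing, "parent_to_repo"), indent=2))
```

The result can be written back to claude_desktop_config.json; Claude Desktop typically needs a restart to pick up the change.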
License
MIT License - see LICENSE for details
Server Config
{
  "mcpServers": {
    "dataset-viewer": {
      "command": "docker",
      "args": [
        "run",
        "-i",
        "--rm",
        "ghcr.io/metorial/mcp-container--privetin--dataset-viewer--dataset-viewer",
        "dataset-viewer"
      ],
      "env": {}
    }
  }
}