
Wikipedia MCP Server


A Model Context Protocol (MCP) server that retrieves information from Wikipedia to provide context to Large Language Models (LLMs). This tool helps AI assistants access factual information from Wikipedia to ground their responses in reliable sources.


Overview

The Wikipedia MCP server provides real-time access to Wikipedia information through a standardized Model Context Protocol interface. This allows LLMs to retrieve accurate and up-to-date information directly from Wikipedia to enhance their responses.


Features

  • Search Wikipedia: Find articles matching specific queries with enhanced diagnostics
  • Retrieve Article Content: Get full article text with all information
  • Article Summaries: Get concise summaries of articles
  • Section Extraction: Retrieve specific sections from articles
  • Link Discovery: Find links within articles to related topics
  • Related Topics: Discover topics related to a specific article
  • Multi-language Support: Access Wikipedia in different languages by specifying the --language or -l argument when running the server (e.g., wikipedia-mcp --language ta for Tamil).
  • Country/Locale Support: Use intuitive country codes like --country US, --country China, or --country TW instead of language codes. Automatically maps to appropriate Wikipedia language variants.
  • Language Variant Support: Support for language variants such as Chinese traditional/simplified (e.g., zh-hans for Simplified Chinese, zh-tw for Traditional Chinese), Serbian scripts (sr-latn, sr-cyrl), and other regional variants.
  • Optional caching: Cache API responses for improved performance using --enable-cache
  • Modern MCP Transport Support: Supports stdio, http, and streamable-http (with legacy sse compatibility).
  • Optional MCP Transport Auth: Secure network transports with --auth-mode static or --auth-mode jwt.
  • Google ADK Compatibility: Fully compatible with Google ADK agents and other AI frameworks that use strict function calling schemas

Installation

Using pipx (Recommended for Claude Desktop)

The best way to install for Claude Desktop usage is with pipx, which installs the command globally:

# Install pipx if you don't have it
pip install pipx
pipx ensurepath

# Install the Wikipedia MCP server
pipx install wikipedia-mcp

This ensures the wikipedia-mcp command is available in Claude Desktop's PATH.

Installing via Smithery

To install wikipedia-mcp for Claude Desktop automatically via Smithery:

npx -y @smithery/cli install @Rudra-ravi/wikipedia-mcp --client claude

From PyPI (Alternative)

You can also install directly from PyPI:

pip install wikipedia-mcp

Note: If you use this method and encounter connection issues with Claude Desktop, you may need to use the full path to the command in your configuration. See the Configuration section for details.

Using a virtual environment

# Create a virtual environment
python3 -m venv venv

# Activate the virtual environment
source venv/bin/activate

# Install the package
pip install git+https://github.com/rudra-ravi/wikipedia-mcp.git

From source

# Clone the repository
git clone https://github.com/rudra-ravi/wikipedia-mcp.git
cd wikipedia-mcp

# Create a virtual environment
python3 -m venv wikipedia-mcp-env
source wikipedia-mcp-env/bin/activate

# Install in development mode
pip install -e .

Usage

Running the server

# If installed with pipx
wikipedia-mcp

# If installed in a virtual environment
source venv/bin/activate
wikipedia-mcp

# Specify transport protocol (default: stdio)
wikipedia-mcp --transport stdio  # For Claude Desktop
wikipedia-mcp --transport http --host 0.0.0.0 --port 8080 --path /mcp
wikipedia-mcp --transport streamable-http --host 0.0.0.0 --port 8080 --path /mcp
wikipedia-mcp --transport sse    # Legacy compatibility transport

# Specify language (default: en for English)
wikipedia-mcp --language ja  # Example for Japanese
wikipedia-mcp --language zh-hans  # Example for Simplified Chinese
wikipedia-mcp --language zh-tw    # Example for Traditional Chinese (Taiwan)
wikipedia-mcp --language sr-latn  # Example for Serbian Latin script

# Specify country/locale (alternative to language codes)
wikipedia-mcp --country US        # English (United States)
wikipedia-mcp --country China     # Chinese Simplified
wikipedia-mcp --country Taiwan    # Chinese Traditional (Taiwan)  
wikipedia-mcp --country Japan     # Japanese
wikipedia-mcp --country Germany   # German
wikipedia-mcp --country france    # French (case insensitive)

# List all supported countries
wikipedia-mcp --list-countries

# Optional: Specify host/port/path for network transport (use 0.0.0.0 for containers)
wikipedia-mcp --transport http --host 0.0.0.0 --port 8080 --path /mcp

# Optional: Enable caching
wikipedia-mcp --enable-cache

# Optional: Use Personal Access Token to avoid rate limiting (403 errors)
wikipedia-mcp --access-token your_wikipedia_token_here

# Or set via environment variable
export WIKIPEDIA_ACCESS_TOKEN=your_wikipedia_token_here
wikipedia-mcp

# Optional: Secure incoming MCP network requests with static bearer token
wikipedia-mcp --transport http --auth-mode static --auth-token your_mcp_token --host 0.0.0.0 --port 8080

# Optional: Secure incoming MCP network requests with JWT validation
wikipedia-mcp --transport http --auth-mode jwt --auth-jwks-uri https://issuer/.well-known/jwks.json --auth-issuer https://issuer

# Security note: prefer http/streamable-http + auth-mode for exposed network transport.

# Combine options
wikipedia-mcp --country Taiwan --enable-cache --access-token your_wikipedia_token --transport http --path /mcp --port 8080
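Conceptually, the static auth mode above amounts to comparing the incoming bearer token against the value given to --auth-token. A minimal sketch, assuming a dict of request headers; the function name and header handling are illustrative, not the package's actual internals:

```python
import hmac

# Illustrative sketch of static bearer-token checking for --auth-mode static.
def is_authorized(headers: dict, expected_token: str) -> bool:
    auth = headers.get("Authorization", "")
    if not auth.startswith("Bearer "):
        return False
    supplied = auth[len("Bearer "):]
    # hmac.compare_digest gives a constant-time comparison, avoiding
    # timing side channels when matching secrets.
    return hmac.compare_digest(supplied, expected_token)
```

For JWT mode, the same check point instead validates the token's signature against the JWKS document and verifies the issuer claim.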

### Docker/Kubernetes

When running inside containers, bind the HTTP MCP server to all interfaces and map
the container port to the host or service:

```bash
# Build and run with Docker
docker build -t wikipedia-mcp .
docker run --rm -p 8080:8080 wikipedia-mcp --transport http --host 0.0.0.0 --port 8080 --path /mcp
```

Kubernetes example (minimal):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: wikipedia-mcp
spec:
  replicas: 1
  selector:
    matchLabels:
      app: wikipedia-mcp
  template:
    metadata:
      labels:
        app: wikipedia-mcp
    spec:
      containers:
        - name: server
          image: your-repo/wikipedia-mcp:latest
          args: ["--transport", "http", "--host", "0.0.0.0", "--port", "8080", "--path", "/mcp"]
          ports:
            - containerPort: 8080
---
apiVersion: v1
kind: Service
metadata:
  name: wikipedia-mcp
spec:
  selector:
    app: wikipedia-mcp
  ports:
    - name: http
      port: 8080
      targetPort: 8080
```

### Configuration for Claude Desktop

Add the following to your Claude Desktop configuration file:

**Option 1: Using command name (requires `wikipedia-mcp` to be in PATH)**
```json
{
  "mcpServers": {
    "wikipedia": {
      "command": "wikipedia-mcp"
    }
  }
}
```

**Option 2: Using full path (recommended if you get connection errors)**

```json
{
  "mcpServers": {
    "wikipedia": {
      "command": "/full/path/to/wikipedia-mcp"
    }
  }
}
```

**Option 3: With country/language specification**

```json
{
  "mcpServers": {
    "wikipedia-us": {
      "command": "wikipedia-mcp",
      "args": ["--country", "US"]
    },
    "wikipedia-taiwan": {
      "command": "wikipedia-mcp",
      "args": ["--country", "TW"]
    },
    "wikipedia-japan": {
      "command": "wikipedia-mcp",
      "args": ["--country", "Japan"]
    }
  }
}
```

To find the full path, run: which wikipedia-mcp

Configuration file locations:

  • macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
  • Windows: %APPDATA%/Claude/claude_desktop_config.json
  • Linux: ~/.config/Claude/claude_desktop_config.json

Note: If you encounter connection errors, see the Troubleshooting section for solutions.

Available MCP Tools

The Wikipedia MCP server provides the following tools for LLMs to interact with Wikipedia:

Each tool is also exposed with a wikipedia_-prefixed alias (for example, wikipedia_get_article) for improved cross-server discoverability.
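The aliasing scheme amounts to registering the same handler under two names. A minimal sketch; register_with_alias is a hypothetical helper, not the server's actual implementation:

```python
# Register a tool under both its canonical name and a wikipedia_-prefixed
# alias, so clients aggregating several MCP servers can tell the tools apart.
def register_with_alias(tools: dict, name: str, fn) -> None:
    tools[name] = fn                   # canonical name, e.g. get_article
    tools[f"wikipedia_{name}"] = fn    # prefixed alias, e.g. wikipedia_get_article
```

Both names resolve to the same callable, so behavior is identical regardless of which name a client invokes.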

search_wikipedia

Search Wikipedia for articles matching a query.

Parameters:
  • query (string): The search term
  • limit (integer, optional): Maximum number of results to return (default: 10)

Returns:
  • A list of search results with titles, snippets, and metadata

get_article

Get the full content of a Wikipedia article.

Parameters:
  • title (string): The title of the Wikipedia article

Returns:
  • Article content including text, summary, sections, links, and categories

get_summary

Get a concise summary of a Wikipedia article.

Parameters:
  • title (string): The title of the Wikipedia article

Returns:
  • A text summary of the article

get_sections

Get the sections of a Wikipedia article.

Parameters:
  • title (string): The title of the Wikipedia article

Returns:
  • A structured list of article sections with their content

get_links

Get the links contained within a Wikipedia article.

Parameters:
  • title (string): The title of the Wikipedia article

Returns:
  • A list of links to other Wikipedia articles

get_coordinates

Get the coordinates of a Wikipedia article.

Parameters:
  • title (string): The title of the Wikipedia article

Returns:
  • A dictionary containing coordinate information, including:
      • title: The article title
      • pageid: The page ID
      • coordinates: List of coordinate objects with latitude, longitude, and metadata
      • exists: Whether the article exists
      • error: Any error message if retrieval failed

get_related_topics

Get topics related to a Wikipedia article based on links and categories.

Parameters:
  • title (string): The title of the Wikipedia article
  • limit (integer, optional): Maximum number of related topics (default: 10)

Returns:
  • A list of related topics with relevance information

summarize_article_for_query

Get a summary of a Wikipedia article tailored to a specific query.

Parameters:
  • title (string): The title of the Wikipedia article
  • query (string): The query to focus the summary on
  • max_length (integer, optional): Maximum length of the summary (default: 250)

Returns:
  • A dictionary containing the title, query, and the focused summary

summarize_article_section

Get a summary of a specific section of a Wikipedia article.

Parameters:
  • title (string): The title of the Wikipedia article
  • section_title (string): The title of the section to summarize
  • max_length (integer, optional): Maximum length of the summary (default: 150)

Returns:
  • A dictionary containing the title, section title, and the section summary

extract_key_facts

Extract key facts from a Wikipedia article, optionally focused on a specific topic within the article.

Parameters:
  • title (string): The title of the Wikipedia article
  • topic_within_article (string, optional): A specific topic within the article to focus fact extraction
  • count (integer, optional): Number of key facts to extract (default: 5)

Returns:
  • A dictionary containing the title, topic, and a list of extracted facts

Country/Locale Support

The Wikipedia MCP server supports intuitive country and region codes as an alternative to language codes. This makes it easier to access region-specific Wikipedia content without needing to know language codes.

Supported Countries and Regions

Use --list-countries to see all supported countries:

wikipedia-mcp --list-countries

This will display countries organized by language, for example:

Supported Country/Locale Codes:
========================================
    en: US, USA, United States, UK, GB, Canada, Australia, ...
    zh-hans: CN, China
    zh-tw: TW, Taiwan  
    ja: JP, Japan
    de: DE, Germany
    fr: FR, France
    es: ES, Spain, MX, Mexico, AR, Argentina, ...
    pt: PT, Portugal, BR, Brazil
    ru: RU, Russia
    ar: SA, Saudi Arabia, AE, UAE, EG, Egypt, ...

Usage Examples

# Major countries by code
wikipedia-mcp --country US       # United States (English)
wikipedia-mcp --country CN       # China (Simplified Chinese)
wikipedia-mcp --country TW       # Taiwan (Traditional Chinese)
wikipedia-mcp --country JP       # Japan (Japanese)
wikipedia-mcp --country DE       # Germany (German)
wikipedia-mcp --country FR       # France (French)
wikipedia-mcp --country BR       # Brazil (Portuguese)
wikipedia-mcp --country RU       # Russia (Russian)

# Countries by full name (case insensitive)
wikipedia-mcp --country "United States"
wikipedia-mcp --country China
wikipedia-mcp --country Taiwan  
wikipedia-mcp --country Japan
wikipedia-mcp --country Germany
wikipedia-mcp --country france    # Case insensitive

# Regional variants
wikipedia-mcp --country HK       # Hong Kong (Traditional Chinese)
wikipedia-mcp --country SG       # Singapore (Simplified Chinese)
wikipedia-mcp --country "Saudi Arabia"  # Arabic
wikipedia-mcp --country Mexico   # Spanish

Country-to-Language Mapping

The server automatically maps country codes to appropriate Wikipedia language editions:

  • English-speaking: US, UK, Canada, Australia, New Zealand, Ireland, South Africa → en
  • Chinese regions:
      • CN, China → zh-hans (Simplified Chinese)
      • TW, Taiwan → zh-tw (Traditional Chinese - Taiwan)
      • HK, Hong Kong → zh-hk (Traditional Chinese - Hong Kong)
      • SG, Singapore → zh-sg (Simplified Chinese - Singapore)
  • Major languages: JP → ja, DE → de, FR → fr, ES → es, IT → it, RU → ru, etc.
  • Regional variants: Supports 140+ countries and regions
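The mapping can be pictured as a case-insensitive lookup table. A minimal sketch with an illustrative subset of entries; the real server covers 140+ countries and the names here are not the package's actual internals:

```python
# Illustrative subset of the country-to-language mapping table.
COUNTRY_TO_LANG = {
    "US": "en", "USA": "en", "UNITED STATES": "en", "UK": "en",
    "CN": "zh-hans", "CHINA": "zh-hans",
    "TW": "zh-tw", "TAIWAN": "zh-tw",
    "JP": "ja", "JAPAN": "ja",
    "DE": "de", "GERMANY": "de",
    "FR": "fr", "FRANCE": "fr",
}

def resolve_country(country: str) -> str:
    key = country.strip().upper()      # lookup is case-insensitive
    try:
        return COUNTRY_TO_LANG[key]
    except KeyError:
        raise ValueError(f"Unsupported country/locale: '{country}'")
```

Uppercasing the key is what makes --country france and --country France equivalent; unknown inputs raise an error rather than silently defaulting.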

Error Handling

If you specify an unsupported country, you'll get a helpful error message:

$ wikipedia-mcp --country INVALID
Error: Unsupported country/locale: 'INVALID'. 
Supported country codes include: US, USA, UK, GB, CA, AU, NZ, IE, ZA, CN. 
Use --language parameter for direct language codes instead.

Use --list-countries to see supported country codes.

Language Variants

The Wikipedia MCP server supports language variants for languages that have multiple writing systems or regional variations. This feature is particularly useful for Chinese, Serbian, Kurdish, and other languages with multiple scripts or regional differences.

Supported Language Variants

Chinese Language Variants

  • zh-hans - Simplified Chinese
  • zh-hant - Traditional Chinese
  • zh-tw - Traditional Chinese (Taiwan)
  • zh-hk - Traditional Chinese (Hong Kong)
  • zh-mo - Traditional Chinese (Macau)
  • zh-cn - Simplified Chinese (China)
  • zh-sg - Simplified Chinese (Singapore)
  • zh-my - Simplified Chinese (Malaysia)

Serbian Language Variants

  • sr-latn - Serbian Latin script
  • sr-cyrl - Serbian Cyrillic script

Kurdish Language Variants

  • ku-latn - Kurdish Latin script
  • ku-arab - Kurdish Arabic script

Norwegian Language Variants

  • no - Norwegian (automatically mapped to Bokmål)

Usage Examples

# Access Simplified Chinese Wikipedia
wikipedia-mcp --language zh-hans

# Access Traditional Chinese Wikipedia (Taiwan)
wikipedia-mcp --language zh-tw

# Access Serbian Wikipedia in Latin script
wikipedia-mcp --language sr-latn

# Access Serbian Wikipedia in Cyrillic script
wikipedia-mcp --language sr-cyrl

How Language Variants Work

When you specify a language variant like zh-hans, the server:

  1. Maps the variant to the base Wikipedia language (e.g., zh for Chinese variants)
  2. Uses the base language for API connections to the Wikipedia servers
  3. Includes the variant parameter in API requests to get content in the specific variant
  4. Returns content formatted according to the specified variant's conventions

This approach ensures optimal compatibility with Wikipedia's API while providing access to variant-specific content and formatting.
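The steps above can be sketched as a small resolution function. The mapping shown is a subset and the function names are illustrative, not the package's actual code:

```python
# Map a language variant to its base wiki and keep the variant for the API.
VARIANT_TO_BASE = {
    "zh-hans": "zh", "zh-hant": "zh", "zh-tw": "zh", "zh-hk": "zh",
    "zh-mo": "zh", "zh-cn": "zh", "zh-sg": "zh", "zh-my": "zh",
    "sr-latn": "sr", "sr-cyrl": "sr",
    "ku-latn": "ku", "ku-arab": "ku",
}

def split_language(language: str):
    """Return (base_language, variant_or_None) for an API request."""
    base = VARIANT_TO_BASE.get(language)
    if base is None:
        return language, None          # plain language code, no variant
    return base, language

def request_params(language: str) -> dict:
    base, variant = split_language(language)
    params = {"format": "json"}
    if variant:
        params["variant"] = variant    # ask the API for variant-converted text
    return params
```

The base language selects which Wikipedia edition to connect to (e.g., zh.wikipedia.org), while the variant parameter drives the content conversion within that edition.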

Example Prompts

Once the server is running and configured with Claude Desktop, you can use prompts like:

General Wikipedia queries:

  • "Tell me about quantum computing using the Wikipedia information."
  • "Summarize the history of artificial intelligence based on Wikipedia."
  • "What does Wikipedia say about climate change?"
  • "Find Wikipedia articles related to machine learning."
  • "Get me the introduction section of the article on neural networks from Wikipedia."
  • "What are the coordinates of the Eiffel Tower?"
  • "Find the latitude and longitude of Mount Everest from Wikipedia."
  • "Get coordinate information for famous landmarks in Paris."

Using country-specific Wikipedia:

  • "Search Wikipedia China for information about the Great Wall." (uses Chinese Wikipedia)
  • "Tell me about Tokyo from Japanese Wikipedia sources."
  • "What does German Wikipedia say about the Berlin Wall?"
  • "Find information about the Eiffel Tower from French Wikipedia."
  • "Get Taiwan Wikipedia's article about Taiwanese cuisine."

Language variant examples:

  • "Search Traditional Chinese Wikipedia for information about Taiwan."
  • "Find Simplified Chinese articles about modern China."
  • "Get information from Serbian Latin Wikipedia about Belgrade."

MCP Resources

The server also provides MCP resources (similar to HTTP endpoints but for MCP):

  • search/{query}: Search Wikipedia for articles matching the query
  • article/{title}: Get the full content of a Wikipedia article
  • summary/{title}: Get a summary of a Wikipedia article
  • sections/{title}: Get the sections of a Wikipedia article
  • links/{title}: Get the links in a Wikipedia article
  • coordinates/{title}: Get the coordinates of a Wikipedia article
  • summary/{title}/query/{query}/length/{max_length}: Get a query-focused summary of an article
  • summary/{title}/section/{section_title}/length/{max_length}: Get a summary of a specific article section
  • facts/{title}/topic/{topic_within_article}/count/{count}: Extract key facts from an article

Development

Local Development Setup

# Clone the repository
git clone https://github.com/rudra-ravi/wikipedia-mcp.git
cd wikipedia-mcp

# Create a virtual environment
python3 -m venv venv
source venv/bin/activate

# Install the package in development mode
pip install -e .

# Install development and test dependencies
pip install -r requirements-dev.txt

# Run the server
wikipedia-mcp

Project Structure

  • wikipedia_mcp/: Main package
  • __main__.py: Entry point for the package
  • server.py: MCP server implementation
  • wikipedia_client.py: Wikipedia API client
  • api/: API implementation
  • core/: Core functionality
  • utils/: Utility functions
  • tests/: Test suite
  • test_basic.py: Basic package tests
  • test_cli.py: Command-line interface tests
  • test_server_tools.py: Comprehensive server and tool tests

Testing

The project includes a comprehensive test suite to ensure reliability and functionality.

Test Structure

The test suite is organized in the tests/ directory with the following test files:

  • test_basic.py: Basic package functionality tests
  • test_cli.py: Command-line interface and transport tests
  • test_server_tools.py: Comprehensive tests for all MCP tools and Wikipedia client functionality

Running Tests

Run All Tests

# Install test dependencies
pip install -r requirements-dev.txt

# Run all tests
python -m pytest tests/ -v

# Run tests with coverage
python -m pytest tests/ --cov=wikipedia_mcp --cov-report=html

Run Specific Test Categories

# Run only unit tests (excludes integration tests)
python -m pytest tests/ -v -m "not integration"

# Run only integration tests (requires internet connection)
python -m pytest tests/ -v -m "integration"

# Run specific test file
python -m pytest tests/test_server_tools.py -v

Test Categories

Unit Tests

  • WikipediaClient Tests: Mock-based tests for all client methods
  • Search functionality
  • Article retrieval
  • Summary extraction
  • Section parsing
  • Link extraction
  • Related topics discovery
  • Server Tests: MCP server creation and tool registration
  • CLI Tests: Command-line interface functionality

Integration Tests

  • Real API Tests: Tests that make actual calls to Wikipedia API
  • End-to-End Tests: Complete workflow testing

Test Configuration

The project uses pytest.ini for test configuration:

[pytest]
markers =
    integration: marks tests as integration tests (may require network access)
    slow: marks tests as slow running

testpaths = tests
addopts = -v --tb=short

Continuous Integration

All tests are designed to:

  • Run reliably in CI/CD environments
  • Handle network failures gracefully
  • Provide clear error messages
  • Cover edge cases and error conditions

Adding New Tests

When contributing new features:

  1. Add unit tests for new functionality
  2. Include both success and failure scenarios
  3. Mock external dependencies (Wikipedia API)
  4. Add integration tests for end-to-end validation
  5. Follow existing test patterns and naming conventions
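A minimal mock-based test in the spirit of points 1–3 might look like the sketch below. The WikipediaClient here is a stand-in for illustration, not the package's real class (which lives in wikipedia_mcp/wikipedia_client.py):

```python
import unittest
from unittest.mock import patch

# Stand-in client; the real implementation calls the live Wikipedia API.
class WikipediaClient:
    def search(self, query, limit=10):
        raise NotImplementedError("real implementation hits the Wikipedia API")

class SearchToolTest(unittest.TestCase):
    @patch.object(WikipediaClient, "search")
    def test_search_success(self, mock_search):
        # Mock the external dependency instead of making a network call.
        mock_search.return_value = [{"title": "Ada Lovelace"}]
        results = WikipediaClient().search("Ada Lovelace", limit=1)
        self.assertEqual(results[0]["title"], "Ada Lovelace")

    @patch.object(WikipediaClient, "search")
    def test_search_failure(self, mock_search):
        # Exercise the failure path as well, per the guidelines above.
        mock_search.side_effect = ConnectionError("network down")
        with self.assertRaises(ConnectionError):
            WikipediaClient().search("anything")
```

Run it with python -m pytest or python -m unittest; because the API call is mocked, it stays fast and deterministic in CI.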

Troubleshooting

Common Issues

Claude Desktop Connection Issues

Problem: Claude Desktop shows errors like spawn wikipedia-mcp ENOENT or cannot find the command.

Cause: This occurs when the wikipedia-mcp command is installed in a user-specific location (like ~/.local/bin/) that's not in Claude Desktop's PATH.

Solutions:

  1. Use the full path to the command (recommended):

     {
       "mcpServers": {
         "wikipedia": {
           "command": "/home/username/.local/bin/wikipedia-mcp"
         }
       }
     }

     To find your exact path, run: which wikipedia-mcp

  2. Install with pipx for global access:

     pipx install wikipedia-mcp

     Then use the standard configuration:

     {
       "mcpServers": {
         "wikipedia": {
           "command": "wikipedia-mcp"
         }
       }
     }

  3. Create a symlink to a global location:

     sudo ln -s ~/.local/bin/wikipedia-mcp /usr/local/bin/wikipedia-mcp

Other Issues

  • Article Not Found: Check the exact spelling of article titles
  • Rate Limiting: Wikipedia API has rate limits; consider adding delays between requests
  • Large Articles: Some Wikipedia articles are very large and may exceed token limits

Troubleshooting Search Issues

If you're experiencing empty search results, use the new diagnostic tools:

1. Test Connectivity

Use the test_wikipedia_connectivity tool to check if you can reach Wikipedia's API:

{
  "tool": "test_wikipedia_connectivity"
}

This returns diagnostics including:

  • Connection status (success or failed)
  • Response time in milliseconds
  • Site/host information when successful
  • Error details when connectivity fails

2. Enhanced Search Error Information

The search_wikipedia tool now returns detailed metadata:

{
  "tool": "search_wikipedia",
  "arguments": {
    "query": "Ada Lovelace",
    "limit": 10
  }
}

Example response:

{
  "query": "Ada Lovelace",
  "results": [...],
  "count": 5,
  "status": "success",
  "language": "en"
}

When no results are found, you receive:

{
  "query": "nonexistent",
  "results": [],
  "status": "no_results",
  "count": 0,
  "language": "en",
  "message": "No search results found. This could indicate connectivity issues, API errors, or simply no matching articles."
}

3. Common Search Issues and Solutions

  • Empty results: Run the connectivity test, verify query spelling, try broader terms.
  • Connection errors: Check firewall or proxy settings, ensure *.wikipedia.org is reachable.
  • API limits: Requests with limit > 500 are automatically capped; negative values reset to the default (10).
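The limit normalization described above can be sketched in a few lines; constant and function names here are illustrative, not the package's actual internals:

```python
# Normalize a requested search limit: negative values fall back to the
# default, and oversized requests are capped at the API maximum.
DEFAULT_LIMIT = 10
MAX_LIMIT = 500

def normalize_limit(limit: int) -> int:
    if limit < 0:
        return DEFAULT_LIMIT       # negative input resets to the default
    return min(limit, MAX_LIMIT)   # values over 500 are capped
```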

4. Debugging with Verbose Logging

Launch the server with debug logging for deeper insight:

wikipedia-mcp --log-level DEBUG

This emits the request parameters, response status codes, and any warnings returned by the API.

Understanding the Model Context Protocol (MCP)

The Model Context Protocol (MCP) is not a traditional HTTP API but a specialized protocol for communication between LLMs and external tools. Key characteristics:

  • Uses stdio for local integrations and streamable HTTP for network integrations (sse retained for legacy compatibility)
  • Designed specifically for AI model interaction
  • Provides standardized formats for tools, resources, and prompts
  • Integrates directly with Claude and other MCP-compatible AI systems

Claude Desktop acts as the MCP client, while this server provides the tools and resources that Claude can use to access Wikipedia information.

Contributing

Contributions are welcome! Please see CONTRIBUTING.md for guidelines.

License

This project is licensed under the MIT License - see the LICENSE file for details.

Security Tier: Reject
Score: 55 out of 100
Scanned by Orcorus Security Scanner on Mar 13, 2026

Security Review

Integration: Wikipedia
Repository: https://github.com/Rudra-ravi/wikipedia-mcp
Commit: latest
Scan Date: 2026-03-13 16:45 UTC

Security Score

55 / 100

Tier Classification

Reject

OWASP Alignment

OWASP Rubric

  • Standard: OWASP Top 10 (2021) aligned review
  • Core methodology: architecture context, trust boundaries, data-flow tracing, threat modeling, control verification, and evidence-backed validation
  • Key characteristics considered: exploitability, impact, likelihood, attacker preconditions, and business context

OWASP Security Category Mapping

  • A01 Broken Access Control: none
  • A02 Cryptographic Failures: 7 finding(s)
  • A03 Injection: none
  • A04 Insecure Design: none
  • A05 Security Misconfiguration: none
  • A06 Vulnerable and Outdated Components: none
  • A07 Identification and Authentication Failures: 28 finding(s)
  • A08 Software and Data Integrity Failures: none
  • A09 Security Logging and Monitoring Failures: none
  • A10 Server-Side Request Forgery: none

Static Analysis Findings (Bandit)

High Severity

None

Medium Severity

  • Possible binding to all interfaces. in tests/test_docker_compatibility.py:30 (confidence: MEDIUM)
  • Possible binding to all interfaces. in tests/test_new_features.py:141 (confidence: MEDIUM)

Low Severity

  • Consider possible security implications associated with the subprocess module. in test_build.py:6 (confidence: HIGH)
  • subprocess call - check for execution of untrusted input. in test_build.py:17 (confidence: HIGH)
  • Use of assert detected. The enclosed code will be removed when compiling to optimised byte code. in tests/test_access_token.py:18 (confidence: HIGH)
  • Possible hardcoded password: 'test_token_123' in tests/test_access_token.py:22 (confidence: MEDIUM)
  • Use of assert detected. The enclosed code will be removed when compiling to optimised byte code. in tests/test_access_token.py:24 (confidence: HIGH)
  • Use of assert detected. The enclosed code will be removed when compiling to optimised byte code. in tests/test_access_token.py:31 (confidence: HIGH)
  • Use of assert detected. The enclosed code will be removed when compiling to optimised byte code. in tests/test_access_token.py:32 (confidence: HIGH)
  • Possible hardcoded password: 'test_token_123' in tests/test_access_token.py:36 (confidence: MEDIUM)
  • Use of assert detected. The enclosed code will be removed when compiling to optimised byte code. in tests/test_access_token.py:40 (confidence: HIGH)
  • Use of assert detected. The enclosed code will be removed when compiling to optimised byte code. in tests/test_access_token.py:41 (confidence: HIGH)
  • Use of assert detected. The enclosed code will be removed when compiling to optimised byte code. in tests/test_access_token.py:42 (confidence: HIGH)
  • Possible hardcoded password: 'test_token_123' in tests/test_access_token.py:47 (confidence: MEDIUM)
  • Use of assert detected. The enclosed code will be removed when compiling to optimised byte code. in tests/test_access_token.py:77 (confidence: HIGH)
  • Use of assert detected. The enclosed code will be removed when compiling to optimised byte code. in tests/test_access_token.py:78 (confidence: HIGH)
  • Possible hardcoded password: 'test_token_123' in tests/test_access_token.py:83 (confidence: MEDIUM)
  • Use of assert detected. The enclosed code will be removed when compiling to optimised byte code. in tests/test_access_token.py:118 (confidence: HIGH)
  • Use of assert detected. The enclosed code will be removed when compiling to optimised byte code. in tests/test_access_token.py:119 (confidence: HIGH)
  • Possible hardcoded password: 'secret_token_123' in tests/test_access_token.py:127 (confidence: MEDIUM)
  • Use of assert detected. The enclosed code will be removed when compiling to optimised byte code. in tests/test_access_token.py:132 (confidence: HIGH)
  • Use of assert detected. The enclosed code will be removed when compiling to optimised byte code. in tests/test_access_token.py:141 (confidence: HIGH)

Build Status

SKIPPED

Build step was skipped to avoid running untrusted build commands by default.

Tests

Detected (pytest)

Documentation

README: Present
Dependency file: Present

AI Security Review

OWASP-Aligned Security Review Report for repository: Wikipedia (wikipedia-mcp)

1) OWASP Review Methodology Applied
- Orientation: Examined project structure and prioritized files (wikipedia_mcp/server.py, wikipedia_mcp/wikipedia_client.py, wikipedia_mcp/auth_config.py, wikipedia_mcp/main.py, wikipedia_mcp/schemas.py, requirements.txt, Dockerfile, tests).
- Entry points: Read CLI entrypoint (wikipedia_mcp/main.py) and server factory (wikipedia_mcp/server.py) to understand exposed transports and auth.
- Data flows: Traced untrusted inputs (CLI args, HTTP path parameters, MCP tool arguments) into WikipediaClient and network calls (requests.get) and ASGI middleware.
- Attack surface: Identified network endpoints, auth handling (static bearer & JWT provider creation), request-building and header handling, caching, and CLI/ENV handling of secrets.
- Threat modeling / controls verification: Mapped findings to OWASP Top 10 categories, assessed exploitability and impact, and verified defensive controls (e.g., token logging safeguards, input trimming, parameter caps).
- Validation: Verified findings against concrete code locations and test coverage where present.

2) OWASP Top-10 (2021) Mapping
- A01 Broken Access Control: StaticBearerAuthMiddleware design and lack of brute-force protections, token handling.
- A02 Cryptographic Failures: observations around JWT configuration and potential misconfiguration; no direct crypto ops in the repo, but reliant on a third-party provider.
- A03 Injection: No injection (SQL/OS/command) found in production code. Tests invoke subprocess (expected).
- A04 Insecure Design: Missing rate limiting, lack of transport-level TLS enforcement guidance, lack of auth brute-force protections, potential resource-exhaustion via API abuse.
- A05 Security Misconfiguration: CLI token via argv, Dockerfile/port mismatch, default network binding behavior and deployment docs.
- A06 Vulnerable and Outdated Components: External dependencies (fastmcp, requests, wikipedia-api) need SCA/patching; requirements.txt uses permissive version ranges.
- A07 Identification and Authentication Failures: Use of static tokens, comparison method, possible exposure of tokens via CLI or environment.
- A08 Software and Data Integrity Failures: No signing/verification for code artifacts; dependency integrity not enforced.
- A09 Security Logging and Monitoring Failures: No rate-limiting or monitoring hooks; limited audit logging of auth failures.
- A10 Server-Side Request Forgery (SSRF): No direct SSRF; api_url is constructed as "https://{base_language}.wikipedia.org/..." which prevents arbitrary host control.

3) Critical Vulnerabilities (RCE/SQLi/unsafe deserialization): NONE found in production code
- No use of eval/exec/compile/pickle or subprocess in production code. All network calls are via requests.get to controlled API endpoints.

4) High Severity Issues
4.1 Credential exposure via CLI arguments (medium-high)
- Files/locations: wikipedia_mcp/main.py (parser.add_argument for --access-token) and usage at assignment: access_token = args.access_token or os.getenv("WIKIPEDIA_ACCESS_TOKEN").
- Issue: Passing secrets on the command line (e.g., --access-token, --auth-token) exposes them in process listings (ps) and shell history on many systems. The code supports and tests CLI token usage (tests/test_access_token.py).
- OWASP mapping: A07 Identification and Authentication Failures, A05 Security Misconfiguration.
- Exploitability: Easy (local attacker or co-tenant in shared hosting) to read process args.
- Impact: Exposure of Wikipedia access token or MCP static token, which could be used to increase rate limits or access protected resources.
- Remediation: Document and strongly recommend using environment variables instead of CLI args for secrets. Add an explicit warning in --help and docs. Prefer reading secrets from stdin or a file with restrictive permissions. Example change: in main.py, add a deprecation/warning and avoid allowing --auth-token/--access-token by default (or accept but warn). References: main.py (parser.add_argument --access-token block ~lines 118-132) and auth_config.build_auth_config handling of auth_token (auth_config.py lines ~69-90).

4.2 Static bearer token comparison is not constant-time (low-medium)
- File/line: wikipedia_mcp/server.py (StaticBearerAuthMiddleware.__init__ and __call__, approx lines 47-66).
- Issue: Authorization header compared with expected string using direct equality (authorization != self._expected). This may allow timing attacks to guess tokens in high-value deployments. Also no lockout or throttling for repeated invalid attempts.
- OWASP mapping: A07, A04.
- Exploitability: Low in most deployments (requires network access to auth-protected endpoint and precise timing measurement). Higher if deployed on same host/fast network.
- Impact: Disclosure of static token leading to unauthorized access to MCP API.
- Remediation: Use hmac.compare_digest to compare tokens to mitigate timing leakage and add throttling/lockout or rate-limiting middleware for network transports. E.g., replace authorization != self._expected with not hmac.compare_digest(authorization or "", self._expected).
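A minimal sketch of this remediation, assuming a middleware shaped like the one described (class name, constructor signature, and the `_is_authorized` helper are illustrative, not the repo's actual code):

```python
import hmac
from typing import Optional


class StaticBearerAuthMiddleware:
    """Hypothetical sketch of a static-bearer check using a constant-time
    comparison instead of plain string equality."""

    def __init__(self, app, expected_token: str):
        self._app = app
        self._expected = "Bearer " + expected_token

    def _is_authorized(self, authorization: Optional[str]) -> bool:
        # hmac.compare_digest's timing does not depend on how many leading
        # characters of a guess match, so response time leaks nothing about
        # the token. A missing header is normalized to the empty string.
        return hmac.compare_digest(authorization or "", self._expected)
```

In the real middleware the result would gate the ASGI call; the key change is only the comparison primitive.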

4.3 Lack of rate limiting / brute force protections (medium)
- Files: server.py (the ASGI server creation and middleware insertion points) and create_server where middleware is set via build_http_middleware.
- Issue: No rate limiting, per-IP authentication failure logging, or integration point for WAF. This permits credential guessing and API abuse (amplified by absence of transport-level TLS enforcement guidance).
- OWASP mapping: A04, A09.
- Exploitability: High (internet-exposed service) if deployed publicly.
- Impact: Denial of service to downstream Wikipedia API keys, rate limit exhaustion of the integration, potential service disruption.
- Remediation: Add an optional rate-limiting middleware (per-IP, per-token) for network transports; add exponential backoff and logging on authentication failures. Document recommended deployment behind a TLS-terminating reverse proxy with client-IP preservation and a WAF.
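One way the recommended in-process limiter could be structured is a sliding-window counter keyed by client IP or token. This is a hedged sketch, not the repo's code; the class name and defaults are assumptions:

```python
import time
from collections import defaultdict, deque


class SlidingWindowRateLimiter:
    """Hypothetical in-process limiter: allow at most `max_requests` per
    `window_seconds` for each key (e.g. client IP or bearer token)."""

    def __init__(self, max_requests: int = 60, window_seconds: float = 60.0):
        self.max_requests = max_requests
        self.window = window_seconds
        self._hits = defaultdict(deque)  # key -> timestamps of recent requests

    def allow(self, key: str) -> bool:
        now = time.monotonic()
        hits = self._hits[key]
        # Evict timestamps that have fallen out of the window.
        while hits and now - hits[0] > self.window:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False  # over the limit; caller should return HTTP 429
        hits.append(now)
        return True
```

An ASGI middleware would call `allow()` before dispatching and reply 429 when it returns False; for public deployments a reverse-proxy limiter remains the stronger control.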

5) Medium Severity Issues
5.1 Permissive dependency spec and supply-chain risk (medium)
- File: requirements.txt (fastmcp>=2.3.0, wikipedia-api>=0.8.0, requests>=2.31.0, python-dotenv>=1.0.0).
- Issue: Broad >= ranges may allow pulling vulnerable versions; no pinned hashes (no pip-compile/poetry lock). No SBOM provided.
- OWASP mapping: A06, A08.
- Remediation: Pin exact vetted versions in production deployments, publish an SBOM, and run SCA checks regularly. Add CI SCA scanning.

5.2 Dockerfile / deployment minor misconfigurations
- File: Dockerfile
- Observations: EXPOSE 8080 but the MCP server default port is 8000; ENTRYPOINT uses the wikipedia-mcp CLI defaulting to stdio transport, which is not appropriate inside containers by default. No USER instruction, so the container runs as root by default.
- OWASP mapping: A05.
- Impact: Operational confusion and potential insecure container defaults.
- Remediation: Align EXPOSE with runtime, document intended container usage, and add a non-root USER and minimal filesystem permissions.

5.3 Lack of TLS guidance / enforcement for network transport (medium)
- Files: main.py server.run for http transport; no TLS handling inside the server.
- Issue: The server accepts network transport without explicit guidance to terminate TLS at reverse proxy. Running the Python process directly on 0.0.0.0:8000 would be plaintext.
- OWASP mapping: A02, A05.
- Remediation: Document that the server must be deployed behind HTTPS termination (nginx, cloud load balancer). Consider providing an option or middleware to enforce TLS headers or reject non-TLS when run with a publicly bound host.

6) Low Severity Issues / Best-practice gaps
6.1 Timing comparison (see 4.2): low risk in many setups.
6.2 Logging content safety: Tests assert tokens are not logged; code appears to avoid logging Authorization headers. Continue to audit future logging additions.
6.3 Input length checks are present in some places but not everywhere: titles passed to wikipediaapi.page are trimmed in some functions but not always; consider enforcing length caps or validating characters. (Files: wikipedia_mcp/wikipedia_client.py search: trimming done for queries; title inputs often passed unchanged.)
6.4 Potential information leakage via the error body_preview returned by _request_json on HTTP errors, which contains up to 200 chars of the remote response. If upstream Wikipedia responses contain sensitive headers/content, they may appear in diagnostics. This is low risk for Wikipedia but worth noting for other upstreams.
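A title guard of the kind 6.3 suggests could be a small normalization helper applied before titles reach the API. Function name and the 300-char cap (mirroring the query cap the review notes) are assumptions:

```python
MAX_TITLE_LEN = 300  # assumed cap, mirroring the query length cap noted above


def normalize_title(title: str) -> str:
    """Hypothetical input guard for wikipedia_client: trim whitespace,
    reject empty input, and cap the length before calling the API."""
    title = title.strip()
    if not title:
        raise ValueError("title must be non-empty")
    return title[:MAX_TITLE_LEN]
```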

7) Key Risk Characteristics (for prioritized findings)
- Credential exposure via CLI args
  - Exploitability: High locally (ps or equivalent), low remotely. Requires access to the same host/process table.
  - Impact: Medium-High (access token compromise).
  - Preconditions: Attacker able to enumerate the process table or access container metadata.
- Static bearer token timing comparison
  - Exploitability: Low to medium (requires repeated measurements and network access).
  - Impact: Medium (token disclosure leads to unauthorized access).
  - Preconditions: Attacker network access to the service and the ability to measure timings.
- No rate limiting
  - Exploitability: High if the service is internet-accessible.
  - Impact: High (API abuse/DoS, cost/rate-limit exhaustion).
  - Preconditions: Service exposed to untrusted networks.
- Dependency & supply-chain risk
  - Exploitability: Medium (depends on vulnerabilities in third-party packages; SCA needed).
  - Impact: High if any dependency contains severe vulnerabilities.
  - Preconditions: Using outdated or vulnerable dependency versions.

8) Positive Security Practices Observed
- Tests assert that Authorization headers are not logged; the code avoids printing tokens in standard informational logs.
- Input validation exists for search queries (trimming, length cap 300) and limits for 'limit' parameters with warnings.
- API calls have bounded retries and timeout handling in _request_json with backoff logic, protecting against some transient failures.
- Many operations catch exceptions and return safe structured error objects (avoids stacktrace leakage to clients).
- Tests cover auth modes, token handling, and logging behaviors demonstrating security awareness.

9) Recommendations (concrete fixes with file:line references)
Note: line numbers are approximate and refer to the files listed.

9.1 Protect secrets from CLI exposure (HIGH / A07)
- Files: wikipedia_mcp/main.py (parser.add_argument --access-token at ~lines 118-130; assignment at ~line 244).
- Change: Deprecate or strongly warn about supplying secrets via CLI. Modify CLI help text to explicitly recommend environment variables or files; optionally disable --access-token and read from WIKIPEDIA_ACCESS_TOKEN only.
- Example remediation: In main.py, update the help text: "(Sensitive -- prefer the WIKIPEDIA_ACCESS_TOKEN env var or read from file/stdin)" and add a runtime warning (log at WARNING) if --access-token is used. Also document this in README and PUBLISHING_GUIDE.md.
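A sketch of what that CLI change could look like. The flag and env-var names follow the review; the `parse_args` wrapper itself is illustrative, not the repo's actual main.py:

```python
import argparse
import logging

logger = logging.getLogger("wikipedia_mcp")


def parse_args(argv):
    """Hypothetical CLI that still accepts --access-token but warns when
    it is used, steering operators toward the environment variable."""
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--access-token",
        default=None,
        help="Wikipedia access token (sensitive -- prefer the "
             "WIKIPEDIA_ACCESS_TOKEN environment variable)",
    )
    args = parser.parse_args(argv)
    if args.access_token is not None:
        # argv is visible in process listings (ps) and shell history,
        # so warn loudly -- without echoing the token itself.
        logger.warning(
            "--access-token passed on the command line; prefer the "
            "WIKIPEDIA_ACCESS_TOKEN environment variable."
        )
    return args
```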

9.2 Use constant-time comparison for bearer tokens (MEDIUM / A07)
- File: wikipedia_mcp/server.py (StaticBearerAuthMiddleware.__call__, approx lines 47-66)
- Change: Replace equality check with hmac.compare_digest to prevent timing attacks. Also ensure authorization header is normalized and defaulted to empty string when missing.
- Code sample: import hmac; use hmac.compare_digest(authorization or "", self._expected).

9.3 Add rate-limiting and lockout for network transports (MEDIUM / A04)
- Files: wikipedia_mcp/server.py (integration point, build_http_middleware); optionally provide an optional param to create_server to accept middleware or integrate a simple in-process rate limiter.
- Change: Add an optional middleware (token/IP-based) that limits requests per minute and throttles authentication failures. For production, recommend deploying behind reverse proxy rate limiting (nginx/Cloud LB) and WAF.

9.4 Enforce/advise TLS and secure deployment defaults (MEDIUM / A02, A05)
- Files: README.md, Dockerfile, and main.py
- Change: Document that public deployments must be behind TLS termination. In Dockerfile, set USER to non-root, expose default port used by server (8000) or clarify intended port mapping. Consider adding environment variables to force TLS-only operation or to require server to be run behind a proxy.

9.5 Harden dependency management and supply-chain (MEDIUM / A06)
- Files: requirements.txt, package metadata
- Change: Pin exact dependency versions (pip freeze or use poetry/poetry.lock), add a requirements.lock or constraints file with hashes (pip-compile --generate-hashes), add CI SCA checks (dependabot, snyk), and publish an SBOM.

9.6 Sanitize and minimize error previews (LOW / A09)
- File: wikipedia_mcp/wikipedia_client.py _request_json (preview returned in body_preview up to 200 chars)
- Change: Remove or reduce returned upstream body previews in error responses, or ensure they are sanitized to avoid leaking sensitive upstream data in diagnostics.
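A small helper could implement this sanitization before the preview is attached to an error response. The helper name and the reduced 120-char cap are assumptions; the 200-char figure comes from the finding above:

```python
import re


def safe_body_preview(body: str, max_chars: int = 120) -> str:
    """Hypothetical sanitizer for _request_json diagnostics: strip control
    characters, collapse whitespace, then truncate the upstream body."""
    # Control characters could corrupt logs or terminal output.
    cleaned = re.sub(r"[\x00-\x1f\x7f]", " ", body)
    cleaned = re.sub(r"\s+", " ", cleaned).strip()
    if len(cleaned) > max_chars:
        return cleaned[:max_chars] + "…(truncated)"
    return cleaned
```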

9.7 Improve logging and monitoring for auth/failed requests (LOW / A09)
- Files: server.py and wikipedia_mcp/main.py
- Change: Add structured logging for auth failures (without sensitive data), count/frequency metrics, and integration points for alerting/log aggregation.
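One shape such structured, non-sensitive logging could take (function and field names are assumptions; the point is that only metadata is recorded, never the presented token):

```python
import logging
import time

logger = logging.getLogger("wikipedia_mcp.auth")


def log_auth_failure(client_ip: str, reason: str) -> dict:
    """Hypothetical structured auth-failure event: records who and why,
    but never the credential itself. The returned dict can feed a
    metrics counter or a log-aggregation pipeline."""
    event = {
        "event": "auth_failure",
        "client_ip": client_ip,
        "reason": reason,  # e.g. "missing_header", "bad_token"
        "ts": time.time(),
    }
    logger.warning("auth_failure ip=%s reason=%s", client_ip, reason)
    return event
```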

9.8 Use secure default user in Dockerfile (LOW / A05)
- File: Dockerfile
- Change: Add a non-root USER with minimal permissions and avoid running as root. Ensure container uses the port the server listens on or adjust CMD/ENTRYPOINT to run network transport with proper args when containerized.
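A hardened Dockerfile fragment along these lines could look as follows. This is a hypothetical sketch: the base image, install step, and CLI arguments are assumptions, not the repo's actual Dockerfile; only the non-root USER and the EXPOSE/port alignment reflect the recommendation above.

```dockerfile
FROM python:3.12-slim
# Create an unprivileged user instead of running as root.
RUN useradd --create-home --uid 1000 mcp
WORKDIR /app
COPY . .
RUN pip install --no-cache-dir .
USER mcp
# Match EXPOSE to the port the server actually binds (8000, per the review).
EXPOSE 8000
# Containers should use a network transport, not the stdio default.
ENTRYPOINT ["wikipedia-mcp", "--transport", "http", "--port", "8000"]
```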

10) Next Tier Upgrade Plan (Bronze / Silver / Gold / Reject)
- Current tier assessment: Silver-leaning Bronze.
- Rationale: The integration has solid input validation, error handling, test coverage including auth behavior, and deliberate design around not logging tokens. However, it lacks operational hardening (rate limiting, TLS enforcement guidance), has CLI token exposure, permissive dependency specs, and some middleware hygiene items.
- Target next tier: Gold (production-ready). Prioritized actions to attain Gold:
  - High priority (must do):
    - Stop recommending CLI tokens: update CLI docs/help and prefer env vars or files (fix main.py).
    - Use constant-time comparison for static tokens in StaticBearerAuthMiddleware (server.py).
    - Implement or document mandatory TLS termination and deployment guidance; update the Dockerfile to use a non-root user.
  - Medium priority:
    - Add rate-limiting middleware and authentication-failure throttling for network transports.
    - Pin dependency versions and add SCA/CI scanning and an SBOM.
  - Low priority:
    - Improve auth/failed-request logging (structured, non-sensitive) and add metrics hooks.
    - Harden error body_preview sanitization in _request_json.
- Example prioritized TODOs with estimated effort:
  - Immediate (1-3 days): Replace the string compare with hmac.compare_digest; update CLI help to warn against passing secrets on argv; document the TLS requirement.
  - Short term (1-2 weeks): Add a simple in-process rate-limiter middleware and auth-failure logging; update the Dockerfile to use a non-root user and align EXPOSE.
  - Medium term (2-4 weeks): Pin dependencies, add CI SCA scans and an SBOM, and introduce optional stricter configuration flags (e.g., an enforce-JWT-only mode) with hardening tests in CI.

11) Concrete findings summary (file, approximate line, severity, remediation)
- wikipedia_mcp/main.py: ~lines 118-132 (parser.add_argument --access-token) and ~244 (access_token assignment). Severity: HIGH (credential exposure via CLI). Remediation: Remove or deprecate CLI secret args, warn the user, prefer env vars, document best practice. (A07 / A05)
- wikipedia_mcp/server.py: ~lines 47-66 (StaticBearerAuthMiddleware). Severity: MEDIUM (non-constant-time token compare; no throttling). Remediation: Use hmac.compare_digest and add rate limiting / throttling for auth failures. (A07 / A04)
- wikipedia_mcp/wikipedia_client.py: _request_json returns body_preview of up to 200 chars on HTTP errors. Severity: LOW. Remediation: Sanitize or truncate the preview further. (A09)
- requirements.txt. Severity: MEDIUM (supply chain). Remediation: Pin exact versions, provide hashes, add SCA. (A06)
- Dockerfile. Severity: LOW (EXPOSE mismatch and run-as-root). Remediation: Align EXPOSE with the server port; set a non-root USER and document container usage. (A05)
- create_server / server.run usage. Severity: MEDIUM (no rate limiting or authentication-failure logging). Remediation: Add middleware for rate limiting and structured auth logs. (A04 / A09)

12) Final Assessment & Recommendation
- Overall security posture: The integration is well-structured and does not contain high-risk coding vulnerabilities (no RCE, injection, unsafe deserialization). The primary risks are operational/configuration (credential exposure via CLI, missing deployment/TLS guidance, lack of rate limiting) and dependency management. Fixing the highlighted items will significantly reduce attack surface and raise the integration to a Gold-ready posture for production use.

If you want, I can prepare a short patch (diff) implementing the most critical fixes: use hmac.compare_digest in StaticBearerAuthMiddleware and add CLI warnings about using --access-token. I can also draft a rate-limiting ASGI middleware sample for inclusion.

-- End of security review --

Summary

Security Score: 55/100 (Reject)
Static analysis found 0 high, 2 medium, and 460 low severity issues.
Build step skipped for safety.
Tests detected.


Configuration

Docker Image

Docker Hub
mcp/wikipedia-mcp

Published by github.com/Rudra-ravi