Proxy Service

A proxy management service for the Stock Bot platform. It builds on the platform's existing libraries (Redis cache, logger, http-client) to scrape, validate, and manage proxies.

Features

  • Automatic Proxy Scraping: Scrapes free proxies from multiple public sources
  • Proxy Validation: Tests proxy connectivity and response times
  • Redis Caching: Stores proxy data with TTL and working status in Redis
  • Health Monitoring: Periodic health checks for working proxies
  • Structured Logging: Comprehensive logging with the platform's logger
  • HTTP Client Integration: Seamless integration with the existing http-client library
  • Background Processing: Non-blocking proxy validation and refresh jobs

Quick Start

import { HttpClient } from '@stock-bot/http-client';
import { proxyService } from './services/proxy.service.js';

// Start the proxy service with automatic refresh
await proxyService.queueRefreshProxies(30 * 60 * 1000); // Refresh every 30 minutes
await proxyService.startHealthChecks(15 * 60 * 1000);   // Health check every 15 minutes

// Get a working proxy (there may be none if no proxy has passed validation yet)
const proxy = await proxyService.getWorkingProxy();
if (!proxy) {
  throw new Error('No working proxies available');
}

// Use the proxy with HttpClient
const client = new HttpClient({ proxy });
const response = await client.get('https://api.example.com/data');

Core Methods

Proxy Management

// Scrape proxies from default sources
const count = await proxyService.scrapeProxies();

// Scrape from custom sources
const customSources = [
  {
    url: 'https://example.com/proxy-list.txt',
    type: 'free',
    format: 'text',
    parser: (content) => parseCustomFormat(content) // parseCustomFormat is a placeholder for your own parser
  }
];
await proxyService.scrapeProxies(customSources);

// Test a specific proxy (for example, one returned by getWorkingProxy())
const result = await proxyService.checkProxy(proxy, 'http://httpbin.org/ip');
console.log(`Proxy working: ${result.isWorking}, Response time: ${result.responseTime}ms`);

Proxy Retrieval

// Get a single working proxy
const proxy = await proxyService.getWorkingProxy();

// Get multiple working proxies
const proxies = await proxyService.getWorkingProxies(10);

// Get all proxies (including non-working)
const allProxies = await proxyService.getAllProxies();

Statistics and Monitoring

// Get proxy statistics
const stats = await proxyService.getProxyStats();
console.log(`Total: ${stats.total}, Working: ${stats.working}, Failed: ${stats.failed}`);
console.log(`Average response time: ${stats.avgResponseTime}ms`);
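
How the statistics fields relate to each other is not spelled out here. As a rough sketch, assuming each cached record carries the `isWorking` and `responseTime` fields returned by `checkProxy`, and assuming `failed` simply counts records not marked as working, the values could be derived as follows. `ProxyRecord`, `ProxyStats`, and `computeStats` are illustrative names, not the service's own.

```typescript
// Hypothetical record and stats shapes; field names follow the examples in this README.
interface ProxyRecord {
  isWorking: boolean;
  responseTime?: number; // milliseconds, present once a check has succeeded
}

interface ProxyStats {
  total: number;
  working: number;
  failed: number;
  avgResponseTime: number;
}

// Assumption: "failed" means "not marked working"; the average covers only
// working proxies that have a recorded response time.
function computeStats(records: ProxyRecord[]): ProxyStats {
  let working = 0;
  let timeSum = 0;
  let timed = 0;

  for (const record of records) {
    if (!record.isWorking) continue;
    working += 1;
    if (typeof record.responseTime === 'number') {
      timeSum += record.responseTime;
      timed += 1;
    }
  }

  return {
    total: records.length,
    working,
    failed: records.length - working,
    // Avoid dividing by zero when nothing has been timed yet.
    avgResponseTime: timed === 0 ? 0 : Math.round(timeSum / timed),
  };
}
```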

Maintenance

// Clear all proxy data
await proxyService.clearProxies();

// Graceful shutdown
await proxyService.shutdown();

Configuration

The service uses environment variables for Redis configuration:

REDIS_HOST=localhost       # Redis host (default: localhost)
REDIS_PORT=6379           # Redis port (default: 6379)
REDIS_DB=0                # Redis database (default: 0)
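
As an illustration of how these variables and their defaults might be resolved, here is a minimal sketch. `RedisConfig` and `readRedisConfig` are hypothetical names; the service's actual configuration code may differ.

```typescript
// Hypothetical config shape; only the three documented variables are read.
interface RedisConfig {
  host: string;
  port: number;
  db: number;
}

// Parses an integer variable, falling back when it is unset or not numeric.
function toInt(value: string | undefined, fallback: number): number {
  const parsed = parseInt(value ?? '', 10);
  return isNaN(parsed) ? fallback : parsed;
}

// Resolves the Redis settings from an environment-like object,
// applying the defaults listed above.
function readRedisConfig(env: Record<string, string | undefined>): RedisConfig {
  return {
    host: env['REDIS_HOST'] || 'localhost',
    port: toInt(env['REDIS_PORT'], 6379),
    db: toInt(env['REDIS_DB'], 0),
  };
}
```

In a Node.js process this would typically be called as `readRedisConfig(process.env)`.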

Proxy Sources

Default sources include:

  • TheSpeedX/PROXY-List (HTTP proxies)
  • clarketm/proxy-list (HTTP proxies)
  • ShiftyTR/Proxy-List (HTTP proxies)
  • monosans/proxy-list (HTTP proxies)

Custom Proxy Sources

You can add custom proxy sources with different formats:

const customSource = {
  url: 'https://api.example.com/proxies',
  type: 'premium',
  format: 'json',
  parser: (content) => {
    const data = JSON.parse(content);
    return data.proxies.map(p => ({
      type: 'http',
      host: p.ip,
      port: p.port,
      username: p.user,
      password: p.pass
    }));
  }
};
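
For a `format: 'text'` source, the parser receives the raw response body. The sketch below assumes the list contains one `host:port` entry per line, which is common for public lists but not guaranteed. `ParsedProxy` and `parseHostPortList` are illustrative names, and the returned fields mirror those in the JSON parser above.

```typescript
// Hypothetical type mirroring the fields used in the JSON parser example.
interface ParsedProxy {
  type: 'http';
  host: string;
  port: number;
}

// Parses a plain-text list with one "host:port" entry per line.
// Blank lines, lines starting with "#", and malformed entries are skipped.
function parseHostPortList(content: string): ParsedProxy[] {
  const proxies: ParsedProxy[] = [];

  for (const rawLine of content.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (line === '' || line.charAt(0) === '#') continue;

    // Split on the last colon so the port is always the final segment.
    const separator = line.lastIndexOf(':');
    if (separator <= 0) continue;

    const host = line.slice(0, separator);
    const portText = line.slice(separator + 1);
    if (/\s/.test(host) || !/^\d{1,5}$/.test(portText)) continue;

    const port = Number(portText);
    if (port < 1 || port > 65535) continue;

    proxies.push({ type: 'http', host, port });
  }

  return proxies;
}
```

Such a function could be passed directly as the `parser` of a text source.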

Integration Examples

With Market Data Collection

import { proxyService } from './services/proxy.service.js';
import { HttpClient } from '@stock-bot/http-client';

async function fetchMarketDataWithProxy(symbol: string) {
  const proxy = await proxyService.getWorkingProxy();
  if (!proxy) {
    throw new Error('No working proxies available');
  }

  const client = new HttpClient({ 
    proxy,
    timeout: 10000,
    retries: 2
  });

  try {
    return await client.get(`https://api.example.com/stock/${symbol}`);
  } catch (error) {
    // Re-validate the proxy before rethrowing; this function does not retry with another proxy
    await proxyService.checkProxy(proxy);
    throw error;
  }
}
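
The function above rethrows after re-checking the proxy. Callers that want to fall back to another proxy could wrap the request in a small helper. The following is a sketch only; `withFallback` is a hypothetical helper and not part of the proxy service.

```typescript
// Tries `attempt` with each candidate in order and returns the first success.
// If every candidate fails, the error from the last attempt is rethrown.
async function withFallback<P, R>(
  candidates: P[],
  attempt: (candidate: P) => Promise<R>,
): Promise<R> {
  if (candidates.length === 0) {
    throw new Error('No candidates available');
  }

  let lastError: unknown;
  for (const candidate of candidates) {
    try {
      return await attempt(candidate);
    } catch (error) {
      // Remember the failure and move on to the next candidate.
      lastError = error;
    }
  }
  throw lastError;
}
```

With the service, the candidates could come from `proxyService.getWorkingProxies(3)`, and `attempt` could create an `HttpClient` with the given proxy and issue the request.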

Proxy Rotation Strategy

async function fetchWithProxyRotation(urls: string[]) {
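  // Fewer proxies than URLs may be returned; the modulo below reuses them in turn
  const proxies = await proxyService.getWorkingProxies(urls.length);
  if (proxies.length === 0) {
    throw new Error('No working proxies available');
  }
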
  const promises = urls.map(async (url, index) => {
    const proxy = proxies[index % proxies.length];
    const client = new HttpClient({ proxy });
    return client.get(url);
  });

  return Promise.allSettled(promises);
}

Cache Structure

The service stores data in Redis with the following structure:

proxy:{host}:{port}          # Individual proxy data with status
proxy:working:{host}:{port}  # Working proxy references
proxy:stats                  # Cached statistics
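
To keep key construction in one place, the layout above could be wrapped in small helpers. This is a sketch; `ProxyAddress`, `proxyKey`, `workingKey`, and `STATS_KEY` are illustrative names rather than the service's own.

```typescript
// Hypothetical address shape: just the parts that appear in the keys.
interface ProxyAddress {
  host: string;
  port: number;
}

// Key holding an individual proxy's data and status.
const proxyKey = (p: ProxyAddress): string => `proxy:${p.host}:${p.port}`;

// Key marking a proxy as currently working.
const workingKey = (p: ProxyAddress): string => `proxy:working:${p.host}:${p.port}`;

// Key holding the cached statistics.
const STATS_KEY = 'proxy:stats';
```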

Logging

The service provides structured logging for all operations:

  • Proxy scraping progress and results
  • Validation results and timing
  • Cache operations and statistics
  • Error conditions and recovery

Background Jobs

Refresh Job

  • Scrapes proxies from all sources
  • Removes duplicates
  • Stores in cache with metadata
  • Triggers background validation
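
The duplicate-removal step can be sketched as follows, assuming two entries are duplicates when their host (compared case-insensitively) and port match. The exact rule the service applies is not documented here, and `dedupeProxies` is an illustrative name.

```typescript
// Minimal shape needed to identify a proxy; real records may carry more fields.
interface ScrapedProxy {
  host: string;
  port: number;
}

// Keeps the first occurrence of each host:port pair and preserves input order.
function dedupeProxies<T extends ScrapedProxy>(proxies: T[]): T[] {
  const seen = new Set<string>();
  const unique: T[] = [];

  for (const proxy of proxies) {
    const key = `${proxy.host.toLowerCase()}:${proxy.port}`;
    if (seen.has(key)) continue;
    seen.add(key);
    unique.push(proxy);
  }

  return unique;
}
```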

Health Check Job

  • Tests existing working proxies
  • Updates status in cache
  • Removes failed proxies from working set
  • Maintains proxy pool health

Validation Job

  • Tests newly scraped proxies
  • Updates working status
  • Measures response times
  • Runs in background to avoid blocking

Error Handling

The service handles the following error conditions:

  • Network failures during scraping
  • Redis connection issues
  • Proxy validation timeouts
  • Invalid proxy formats
  • Cache operation failures

All errors are logged with context and don't crash the service.

Performance Considerations

  • Concurrent Validation: Processes proxies in chunks of 50
  • Rate Limiting: Includes delays between validation chunks
  • Cache Efficiency: Uses TTL and working proxy sets
  • Memory Management: Processes large proxy lists in batches
  • Background Processing: Validation doesn't block main operations
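
The chunked validation described above could look roughly like this. The chunk size of 50 comes from the list above; the 1000 ms pause is an assumed value, since the actual delay is not stated. `chunk`, `sleep`, and `validateInChunks` are illustrative names, and `check` stands in for a call such as `proxyService.checkProxy`.

```typescript
// Splits a list into consecutive chunks of at most `size` items.
function chunk<T>(items: T[], size: number): T[][] {
  if (size < 1 || Math.floor(size) !== size) {
    throw new RangeError('size must be a positive integer');
  }
  const chunks: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    chunks.push(items.slice(i, i + size));
  }
  return chunks;
}

const sleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Checks each chunk concurrently, pausing between chunks, and returns the
// items that passed. A rejected check counts as a failed item.
async function validateInChunks<T>(
  items: T[],
  check: (item: T) => Promise<boolean>,
  chunkSize = 50,
  delayMs = 1000, // assumed pause between chunks
): Promise<T[]> {
  const working: T[] = [];
  let first = true;

  for (const group of chunk(items, chunkSize)) {
    if (!first) await sleep(delayMs);
    first = false;

    const flags = await Promise.all(
      group.map((item) => check(item).catch(() => false)),
    );
    group.forEach((item, i) => {
      if (flags[i]) working.push(item);
    });
  }

  return working;
}
```

Bounding the number of in-flight checks per chunk keeps memory and socket usage predictable on large lists, at the cost of a longer total validation time.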

Dependencies

  • @stock-bot/cache: Redis caching with TTL support
  • @stock-bot/logger: Structured logging with Loki integration
  • @stock-bot/http-client: HTTP client with built-in proxy support
  • ioredis: Redis client (via cache library)
  • pino: High-performance logging (via logger library)

Limitations

Due to the current Redis cache provider interface:

  • Key pattern matching not available
  • Bulk operations limited
  • Set operations (sadd, srem) not directly supported

The service works around these limitations by operating on individual keys. Functionality is preserved, and these areas are noted for future enhancement.
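
One possible way to emulate set membership with single-key operations is to keep an index key alongside the per-proxy keys. The sketch below is not necessarily what the service does: `KeyValueCache` is a hypothetical minimal interface, `proxy:working:index` is an assumed key that is not part of the documented cache structure, and the in-memory cache exists only to exercise the example.

```typescript
// Hypothetical minimal cache interface: single-key operations only.
interface KeyValueCache {
  get(key: string): Promise<string | null>;
  set(key: string, value: string): Promise<void>;
  del(key: string): Promise<void>;
}

// Assumed index key holding a JSON array of "host:port" members.
const WORKING_INDEX_KEY = 'proxy:working:index';

// Lists the current members by reading the index key.
async function listWorking(cache: KeyValueCache): Promise<string[]> {
  const raw = await cache.get(WORKING_INDEX_KEY);
  return raw ? (JSON.parse(raw) as string[]) : [];
}

// Emulates SADD: write the member key, then record it in the index.
async function addWorking(cache: KeyValueCache, hostPort: string): Promise<void> {
  await cache.set(`proxy:working:${hostPort}`, '1');
  const index = await listWorking(cache);
  if (index.indexOf(hostPort) === -1) {
    index.push(hostPort);
    await cache.set(WORKING_INDEX_KEY, JSON.stringify(index));
  }
}

// Emulates SREM: delete the member key, then drop it from the index.
async function removeWorking(cache: KeyValueCache, hostPort: string): Promise<void> {
  await cache.del(`proxy:working:${hostPort}`);
  const index = await listWorking(cache);
  const next = index.filter((entry) => entry !== hostPort);
  if (next.length !== index.length) {
    await cache.set(WORKING_INDEX_KEY, JSON.stringify(next));
  }
}

// In-memory stand-in used only to exercise the sketch.
function createMemoryCache(): KeyValueCache {
  const store = new Map<string, string>();
  return {
    get: async (key) => store.get(key) ?? null,
    set: async (key, value) => { store.set(key, value); },
    del: async (key) => { store.delete(key); },
  };
}
```

Unlike native Redis set commands, the read-modify-write on the index is not atomic, so concurrent writers could lose updates. That is one reason native set operations appear under Future Enhancements.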

Future Enhancements

  • Premium proxy source integration
  • Proxy performance analytics
  • Geographic proxy distribution
  • Protocol-specific proxy pools (HTTP, HTTPS, SOCKS)
  • Enhanced caching with set operations
  • Proxy authentication management