finished off spider search

Boki 2025-06-17 20:38:12 -04:00
parent 0cf0b315df
commit 95eda4a842
4 changed files with 364 additions and 40 deletions

.env
View file

@@ -11,7 +11,7 @@ DATA_SERVICE_PORT=2001
# Queue and Worker Configuration
WORKER_COUNT=1
WORKER_CONCURRENCY=4
WORKER_CONCURRENCY=2
WEBSHARE_API_KEY=y8ay534rcbybdkk3evnzmt640xxfhy7252ce2t98
WEBSHARE_ROTATING_PROXY_URL=http://doimvbnb-rotate:w5fpiwrb9895@p.webshare.io:80/

CLAUDE.md (new file)
View file

@@ -0,0 +1,152 @@
# CLAUDE.md
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
## Development Commands
**Package Manager**: Bun (v1.1.0+)
**Build & Development**:
- `bun install` - Install dependencies
- `bun run dev` - Start all services in development mode (uses Turbo)
- `bun run build` - Build all services and libraries
- `bun run build:libs` - Build only shared libraries
- `./scripts/build-all.sh` - Custom build script with options
**Testing**:
- `bun test` - Run all tests
- `bun run test:libs` - Test only shared libraries
- `bun run test:apps` - Test only applications
- `bun run test:coverage` - Run tests with coverage
**Code Quality**:
- `bun run lint` - Lint TypeScript files
- `bun run lint:fix` - Auto-fix linting issues
- `bun run format` - Format code using Prettier
- `./scripts/format.sh` - Format script
**Infrastructure**:
- `bun run infra:up` - Start database infrastructure (PostgreSQL, QuestDB, MongoDB, Dragonfly)
- `bun run infra:down` - Stop infrastructure
- `bun run infra:reset` - Reset infrastructure with clean volumes
- `bun run docker:admin` - Start admin GUIs (pgAdmin, Mongo Express, Redis Insight)
**Database Setup**:
- `bun run db:setup-ib` - Setup Interactive Brokers database schema
- `bun run db:init` - Initialize database schemas
## Architecture Overview
**Microservices Architecture** with shared libraries and multi-database storage:
### Core Services (`apps/`)
- **data-service** - Market data ingestion from multiple providers (Yahoo, QuoteMedia, IB)
- **processing-service** - Data cleaning, validation, and technical indicators
- **strategy-service** - Trading strategies and backtesting (multi-mode: live, event-driven, vectorized, hybrid)
- **execution-service** - Order management and risk controls
- **portfolio-service** - Position tracking and performance analytics
- **web** - React dashboard with real-time updates
### Shared Libraries (`libs/`)
- **config** - Environment configuration with Zod validation
- **logger** - Loki-integrated structured logging (use `getLogger()` pattern)
- **http** - HTTP client with proxy support and rate limiting
- **cache** - Redis/Dragonfly caching layer
- **queue** - BullMQ-based job processing with batch support
- **postgres-client** - PostgreSQL operations with transactions
- **questdb-client** - Time-series data storage
- **mongodb-client** - Document storage operations
- **utils** - Financial calculations and technical indicators
### Database Strategy
- **PostgreSQL** - Transactional data (orders, positions, strategies)
- **QuestDB** - Time-series data (OHLCV, indicators, performance metrics)
- **MongoDB** - Document storage (configurations, raw responses)
- **Dragonfly** - Event bus and caching (Redis-compatible)
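The storage split above can be captured as a small routing helper. A minimal sketch only: the record kinds and store names here are illustrative, not an API the repo actually exposes — each service decides its own persistence internally.

```typescript
// Hypothetical record kinds; the real services route data internally.
type RecordKind = 'order' | 'position' | 'strategy' | 'ohlcv' | 'indicator' | 'config' | 'raw-response';

type Store = 'postgres' | 'questdb' | 'mongodb';

// Map a record kind to its backing store per the database strategy above.
function storeFor(kind: RecordKind): Store {
  switch (kind) {
    case 'order':
    case 'position':
    case 'strategy':
      return 'postgres'; // transactional data
    case 'ohlcv':
    case 'indicator':
      return 'questdb'; // time-series data
    case 'config':
    case 'raw-response':
      return 'mongodb'; // document storage
  }
}
```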
## Key Patterns & Conventions
**Library Usage**:
- Import from shared libraries: `import { getLogger } from '@stock-bot/logger'`
- Use configuration: `import { databaseConfig } from '@stock-bot/config'`
- Logger pattern: `const logger = getLogger('service-name')`
**Service Structure**:
- Each service has `src/index.ts` as entry point
- Routes in `src/routes/` using Hono framework
- Providers/services in `src/providers/` or `src/services/`
- Use dependency injection pattern
**Data Processing**:
- Raw data → QuestDB via providers
- Processed data → PostgreSQL via processing service
- Event-driven communication via Dragonfly
- Queue-based batch processing for large datasets
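Queue-based batch processing here means chunking a large dataset into bounded jobs before enqueueing. A dependency-free sketch of the chunking step (the repo enqueues via the BullMQ-backed `@stock-bot/queue` library, whose actual API is not shown here):

```typescript
// Split a large item list into bounded batches; real code would enqueue
// each batch as a separate job via the queue library.
function toBatches<T>(items: T[], batchSize: number): T[][] {
  if (batchSize <= 0) throw new Error('batchSize must be positive');
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += batchSize) {
    batches.push(items.slice(i, i + batchSize));
  }
  return batches;
}
```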
**Multi-Mode Backtesting**:
- **Live Mode** - Real-time trading with brokers
- **Event-Driven** - Realistic simulation with market conditions
- **Vectorized** - Fast mathematical backtesting for optimization
- **Hybrid** - Validation by comparing vectorized vs event-driven results
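Hybrid-mode validation boils down to comparing the two backtests' outputs within a tolerance. A hedged sketch under the assumption that both modes produce an equity curve per bar (the repo's actual comparison metric may differ):

```typescript
// Largest per-bar gap between the vectorized and event-driven equity curves.
function maxDivergence(vectorized: number[], eventDriven: number[]): number {
  if (vectorized.length !== eventDriven.length) {
    throw new Error('curves must cover the same bars');
  }
  let max = 0;
  for (let i = 0; i < vectorized.length; i++) {
    max = Math.max(max, Math.abs(vectorized[i] - eventDriven[i]));
  }
  return max;
}

// Hybrid validation passes when the curves stay within tolerance.
function curvesAgree(a: number[], b: number[], tolerance = 1e-2): boolean {
  return maxDivergence(a, b) <= tolerance;
}
```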
## Development Workflow
1. **Start Infrastructure**: `bun run infra:up`
2. **Build Libraries**: `bun run build:libs`
3. **Start Development**: `bun run dev`
4. **Access UIs**:
- Dashboard: http://localhost:4200
- QuestDB Console: http://localhost:9000
- Grafana: http://localhost:3000
- pgAdmin: http://localhost:8080
## Important Files & Locations
**Configuration**:
- Environment variables in `.env` files
- Service configs in `libs/config/src/`
- Database init scripts in `database/postgres/init/`
**Key Scripts**:
- `scripts/build-all.sh` - Production build with cleanup
- `scripts/docker.sh` - Docker management
- `scripts/format.sh` - Code formatting
**Documentation**:
- `SIMPLIFIED-ARCHITECTURE.md` - Detailed architecture overview
- `DEVELOPMENT-ROADMAP.md` - Development phases and priorities
- Individual library READMEs in `libs/*/README.md`
## Current Development Phase
**Phase 1: Data Foundation Layer** (In Progress)
- Enhancing data provider reliability and rate limiting
- Implementing data validation and quality metrics
- Optimizing QuestDB storage for time-series data
- Building robust HTTP client with circuit breakers
Focus on data quality and provider fault tolerance before advancing to strategy implementation.
## Testing & Quality
- Use Bun's built-in test runner
- Integration tests with Testcontainers for databases

- ESLint for code quality with TypeScript rules
- Prettier for code formatting
- All services should have health check endpoints
## Environment Variables
Key environment variables (see `.env` example):
- `NODE_ENV` - Environment (development/production)
- `DATA_SERVICE_PORT` - Port for data service
- `DRAGONFLY_HOST/PORT` - Cache/event bus connection
- Database connection strings for PostgreSQL, QuestDB, MongoDB
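The repo validates environment config with Zod (see `libs/config`); this dependency-free sketch only shows the shape of that validation for two of the keys above, with hypothetical field names:

```typescript
// Illustrative parsed-config shape; the real Zod schemas live in libs/config/src/.
interface DataServiceConfig { nodeEnv: 'development' | 'production'; port: number; }

function parseConfig(env: Record<string, string | undefined>): DataServiceConfig {
  const nodeEnv = env.NODE_ENV;
  if (nodeEnv !== 'development' && nodeEnv !== 'production') {
    throw new Error(`NODE_ENV must be development or production, got ${nodeEnv}`);
  }
  const port = Number(env.DATA_SERVICE_PORT);
  if (!Number.isInteger(port) || port <= 0 || port > 65535) {
    throw new Error(`DATA_SERVICE_PORT must be a valid port, got ${env.DATA_SERVICE_PORT}`);
  }
  return { nodeEnv, port };
}
```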
## Monitoring & Observability
- **Logging**: Structured JSON logs to Loki
- **Metrics**: Prometheus metrics collection
- **Visualization**: Grafana dashboards
- **Queue Monitoring**: Bull Board for job queues
- **Health Checks**: All services expose `/health` endpoints
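A sketch of the `/health` payload every service is expected to expose; the field names are assumptions, and with Hono this would back something like `app.get('/health', c => c.json(healthPayload('data-service', process.uptime())))`:

```typescript
// Build the health-check response body for a service.
function healthPayload(service: string, uptimeSeconds: number) {
  return {
    status: 'ok' as const,
    service,
    uptimeSeconds: Math.floor(uptimeSeconds),
    timestamp: new Date().toISOString(),
  };
}
```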

View file

@@ -1,5 +1,6 @@
import { getLogger } from '@stock-bot/logger';
import { providerRegistry, type ProviderConfigWithSchedule } from '@stock-bot/queue';
import type { SymbolSpiderJob } from './qm.tasks';
const logger = getLogger('qm-provider');
@@ -39,11 +40,25 @@ export function initializeQMProvider() {
};
}
},
'spider-symbol-search': async (payload: SymbolSpiderJob) => {
logger.info('Processing spider symbol search job', { payload });
const { spiderSymbolSearch } = await import('./qm.tasks');
const result = await spiderSymbolSearch(payload);
logger.info('Spider search job completed', {
success: result.success,
symbolsFound: result.symbolsFound,
jobsCreated: result.jobsCreated,
payload,
});
return result;
},
},
scheduledJobs: [
{
type: 'create-sessions',
type: 'session-management',
operation: 'create-sessions',
payload: {},
cronPattern: '*/15 * * * * *', // Every 15 seconds
@@ -52,13 +67,11 @@ export function initializeQMProvider() {
description: 'Create and maintain QM sessions',
},
{
type: 'search-symbols',
operation: 'search-symbols',
type: 'qm-maintenance',
operation: 'spider-symbol-search',
payload: {},
cronPattern: '0 0 * * 0', // Weekly: Sundays at midnight
priority: 10,
immediately: true,
delay: 100000, // Delay to allow sessions to be ready
description: 'Comprehensive symbol search using QM API',
},
],

View file

@@ -1,7 +1,6 @@
import { getRandomUserAgent } from '@stock-bot/http';
import { getLogger } from '@stock-bot/logger';
import { getMongoDBClient } from '@stock-bot/mongodb-client';
import { SymbolSearchUtil } from '../utils/symbol-search.util';
import { getProxy } from './webshare.provider';
// Shared instances (module-scoped, not global)
@@ -17,6 +16,21 @@ export interface QMSession {
lastUsed: Date;
}
export interface SymbolSpiderJob {
prefix: string | null; // null = root job (A-Z)
depth: number; // 1=A, 2=AA, 3=AAA, etc.
source: string; // 'qm'
maxDepth?: number; // optional max depth limit
}
interface Exchange {
exchange: string;
exchangeCode: string;
exchangeShortName: string;
countryCode: string;
source: string;
}
function getQmHeaders(): Record<string, string> {
return {
'User-Agent': getRandomUserAgent(),
@@ -112,6 +126,7 @@ export async function createSessions(): Promise<void> {
logger.info('QM session created successfully', {
sessionId,
sessionData,
proxy: newSession.proxy,
sessionCount: sessionCache[sessionId].length + 1,
});
newSession.headers['Datatool-Token'] = sessionData.token;
@@ -125,6 +140,151 @@ export async function createSessions(): Promise<void> {
}
}
// Spider-based symbol search functions
export async function spiderSymbolSearch(
payload: SymbolSpiderJob
): Promise<{ success: boolean; symbolsFound: number; jobsCreated: number }> {
try {
if (!isInitialized) {
await initializeQMResources();
}
const { prefix, depth, source = 'qm', maxDepth = 4 } = payload;
logger.info(`Starting spider search`, { prefix: prefix || 'ROOT', depth, source });
// Root job: Create A-Z jobs
if (prefix === null || prefix === undefined || prefix === '') {
return await createAlphabetJobs(source, maxDepth);
}
// Leaf job: Search for symbols with this prefix
return await searchAndSpawnJobs(prefix, depth, source, maxDepth);
} catch (error) {
logger.error('Spider symbol search failed', { error, payload });
return { success: false, symbolsFound: 0, jobsCreated: 0 };
}
}
async function createAlphabetJobs(
source: string,
maxDepth: number
): Promise<{ success: boolean; symbolsFound: number; jobsCreated: number }> {
try {
const { queueManager } = await import('../index');
let jobsCreated = 0;
// Create jobs for A-Z
for (let i = 0; i < 26; i++) {
const letter = String.fromCharCode(65 + i); // A=65, B=66, etc.
const job: SymbolSpiderJob = {
prefix: letter,
depth: 1,
source,
maxDepth,
};
await queueManager.add(
'qm',
{
provider: 'qm',
operation: 'spider-symbol-search',
payload: job,
},
{
priority: 5,
delay: i * 100, // Stagger jobs by 100ms
attempts: 3,
backoff: { type: 'exponential', delay: 2000 },
}
);
jobsCreated++;
}
logger.info(`Created ${jobsCreated} alphabet jobs (A-Z)`);
return { success: true, symbolsFound: 0, jobsCreated };
} catch (error) {
logger.error('Failed to create alphabet jobs', { error });
return { success: false, symbolsFound: 0, jobsCreated: 0 };
}
}
async function searchAndSpawnJobs(
prefix: string,
depth: number,
source: string,
maxDepth: number
): Promise<{ success: boolean; symbolsFound: number; jobsCreated: number }> {
try {
// Ensure sessions exist
const sessionId = 'dc8c9930437f65d30f6597768800957017bac203a0a50342932757c8dfa158d6';
const currentSessions = sessionCache[sessionId] || [];
if (currentSessions.length === 0) {
logger.info('No sessions found, creating sessions first...');
await createSessions();
await new Promise(resolve => setTimeout(resolve, 1000));
}
// Search for symbols with this prefix
const symbols = await searchQMSymbolsAPI(prefix);
const symbolCount = symbols.length;
logger.info(`Prefix "${prefix}" returned ${symbolCount} symbols`);
let jobsCreated = 0;
// If we have 50+ symbols and haven't reached max depth, spawn sub-jobs
if (symbolCount >= 50 && depth < maxDepth) {
const { queueManager } = await import('../index');
logger.info(`Spawning sub-jobs for prefix "${prefix}" (${symbolCount} >= 50 symbols)`);
// Create jobs for prefixA, prefixB, prefixC... prefixZ
for (let i = 0; i < 26; i++) {
const letter = String.fromCharCode(65 + i);
const newPrefix = prefix + letter;
const job: SymbolSpiderJob = {
prefix: newPrefix,
depth: depth + 1,
source,
maxDepth,
};
await queueManager.add(
'qm',
{
provider: 'qm',
operation: 'spider-symbol-search',
payload: job,
},
{
priority: Math.max(1, 6 - depth), // Higher priority for deeper jobs
delay: i * 50, // Stagger sub-jobs by 50ms
attempts: 3,
backoff: { type: 'exponential', delay: 2000 },
}
);
jobsCreated++;
}
logger.info(`Created ${jobsCreated} sub-jobs for prefix "${prefix}"`);
} else {
// Terminal case: save symbols and exchanges (already done in searchQMSymbolsAPI)
logger.info(`Terminal case for prefix "${prefix}": ${symbolCount} symbols saved`);
}
return { success: true, symbolsFound: symbolCount, jobsCreated };
} catch (error) {
logger.error(`Failed to search and spawn jobs for prefix "${prefix}"`, { error, depth });
return { success: false, symbolsFound: 0, jobsCreated: 0 };
}
}
// API call function to search symbols via QM
async function searchQMSymbolsAPI(query: string): Promise<string[]> {
const proxy = getProxy();
@@ -154,8 +314,15 @@ async function searchQMSymbolsAPI(query: string): Promise<string[]> {
const symbols = await response.json();
const client = getMongoDBClient();
await client.batchUpsert('qmSymbols', symbols, ['symbol', 'exchange']);
const exchanges: any[] = [];
const updatedSymbols = symbols.map((symbol: any) => {
return {
...symbol,
qmSearchCode: symbol.symbol, // Store original symbol for reference
symbol: symbol.symbol.split(':')[0], // Extract symbol from "symbol:exchange"
};
});
await client.batchUpsert('qmSymbols', updatedSymbols, ['qmSearchCode']);
const exchanges: Exchange[] = [];
for (const symbol of symbols) {
if (!exchanges.some(ex => ex.exchange === symbol.exchange)) {
exchanges.push({
@@ -163,6 +330,7 @@ async function searchQMSymbolsAPI(query: string): Promise<string[]> {
exchangeCode: symbol.exchangeCode,
exchangeShortName: symbol.exchangeShortName,
countryCode: symbol.countryCode,
source: 'qm',
});
}
}
@@ -170,7 +338,9 @@ async function searchQMSymbolsAPI(query: string): Promise<string[]> {
session.successfulCalls++;
session.lastUsed = new Date();
logger.info(`QM API returned ${symbols.length} symbols for query: ${query}`);
logger.info(
`QM API returned ${symbols.length} symbols for query: ${query} with proxy ${session.proxy}`
);
return symbols;
} catch (error) {
logger.error(`Error searching QM symbols for query "${query}":`, error);
@@ -187,42 +357,30 @@ export async function fetchSymbols(): Promise<unknown[] | null> {
if (!isInitialized) {
await initializeQMResources();
}
const sessionId = 'dc8c9930437f65d30f6597768800957017bac203a0a50342932757c8dfa158d6'; // Use the session ID for symbol lookup
const currentSessions = sessionCache[sessionId] || [];
if (currentSessions.length === 0) {
logger.info('No sessions found, creating sessions first...');
await createSessions();
logger.info('🔄 Starting QM spider-based symbol search...');
// Wait a bit for sessions to be ready
await new Promise(resolve => setTimeout(resolve, 2000));
const newSessions = sessionCache[sessionId] || [];
if (newSessions.length === 0) {
throw new Error('Failed to create sessions before symbol search');
}
logger.info(`Created ${newSessions.length} sessions for symbol search`);
}
logger.info('🔄 Starting QM symbols fetch...');
// Create search function that uses our QM API
const searchFunction = async (query: string): Promise<string[]> => {
return await searchQMSymbolsAPI(query);
// Start the spider process with root job
const rootJob: SymbolSpiderJob = {
prefix: null, // Root job creates A-Z jobs
depth: 0,
source: 'qm',
maxDepth: 4,
};
// Use the utility to perform comprehensive search
const symbols = await SymbolSearchUtil.search(
searchFunction,
50, // threshold
4, // max depth (A -> AA -> AAA -> AAAA)
200 // delay between requests in ms
);
const result = await spiderSymbolSearch(rootJob);
logger.info(`QM symbols fetch completed. Found ${symbols.length} total symbols`);
return symbols;
if (result.success) {
logger.info(
`QM spider search initiated successfully. Created ${result.jobsCreated} initial jobs`
);
return [`Spider search initiated with ${result.jobsCreated} jobs`];
} else {
logger.error('Failed to initiate QM spider search');
return null;
}
} catch (error) {
logger.error('❌ Failed to fetch QM symbols', { error });
logger.error('❌ Failed to start QM spider symbol search', { error });
return null;
}
}
@@ -246,4 +404,5 @@ export const qmTasks = {
createSessions,
fetchSymbols,
fetchExchanges,
spiderSymbolSearch,
};