How Oxeye works · methodology from Oxeye Technologies LLC

The methodology, with none of the marketing.

Oxeye — the flagship iOS product of Oxeye Technologies LLC — is not a black box. This page describes the actual data we use, how we train, how we validate, and which numbers we publish. If you have questions or want to replicate a result, get in touch.

1. Data ingestion

We ingest three classes of data, all of which are licensed or publicly available:

  • Prices — end-of-day OHLCV for every US-listed stock, fetched daily from Financial Modeling Prep. We use FMP's bulk end-of-day endpoint, which returns the entire universe in a single API call that takes roughly ten seconds.
  • Fundamentals — company profiles, ratios, and financial statements, also from FMP. Refreshed monthly on the same schedule as the ML retrain.
  • Economic indicators — Federal Funds rate, CPI, unemployment rate, consumer sentiment, and 2Y/10Y Treasury yields from FRED. Refreshed monthly.

Prices and fundamentals are written to Google Cloud Storage as Parquet files with Hive-style partitioning. BigQuery reads them via external tables, so there's no separate copy to keep in sync.
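
Hive-style partitioning encodes partition columns as key=value segments in the object path, which is what lets an external table prune files by date. A minimal sketch of such a layout follows; the bucket name, dataset name, and the year/month/day partition keys are assumptions for illustration, not our actual paths.

```python
from datetime import date

# Hypothetical bucket name, for illustration only.
BUCKET = "example-bucket"


def partition_uri(dataset: str, trading_day: date, filename: str = "data.parquet") -> str:
    """Build a Hive-style object URI with year/month/day partition keys (assumed keys)."""
    return (
        f"gs://{BUCKET}/{dataset}/"
        f"year={trading_day.year}/"
        f"month={trading_day.month:02d}/"
        f"day={trading_day.day:02d}/"
        f"{filename}"
    )


print(partition_uri("prices", date(2024, 3, 1)))
# gs://example-bucket/prices/year=2024/month=03/day=01/data.parquet
```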

2. ARIMA_PLUS_XREG price forecasts

For every ticker that clears our quality filters, we train an ARIMA_PLUS_XREG model in BigQuery ML. "XREG" means we include external regressors: in our case, the economic indicators listed above. This lets the model learn, for example, that a particular stock's residuals correlate with Fed Funds changes or unemployment.
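
To make the per-ticker training step concrete, here is a hedged sketch of a helper that assembles a BigQuery ML CREATE MODEL statement. The dataset, table, model, and column names are hypothetical and the regressor list is only a stand-in for the economic indicators; the statement uses only the model_type, time_series_timestamp_col, and time_series_data_col options.

```python
import re

# Hypothetical regressor column names standing in for the economic indicators.
REGRESSORS = ["fed_funds_rate", "cpi", "unemployment_rate", "consumer_sentiment"]


def create_model_sql(ticker: str, dataset: str = "example_dataset") -> str:
    """Return a CREATE MODEL statement for one ticker (illustrative names only)."""
    # The ticker is interpolated into SQL, so accept plain symbols only.
    if not re.fullmatch(r"[A-Z][A-Z0-9.\-]{0,9}", ticker):
        raise ValueError(f"unexpected ticker: {ticker!r}")
    model_name = "arima_xreg_" + re.sub(r"[^A-Z0-9]", "_", ticker).lower()
    columns = ",\n  ".join(["date", "close", *REGRESSORS])
    return (
        f"CREATE OR REPLACE MODEL `{dataset}.{model_name}`\n"
        "OPTIONS(\n"
        "  model_type = 'ARIMA_PLUS_XREG',\n"
        "  time_series_timestamp_col = 'date',\n"
        "  time_series_data_col = 'close'\n"
        ") AS\n"
        f"SELECT\n  {columns}\n"
        f"FROM `{dataset}.training_rows`\n"
        f"WHERE ticker = '{ticker}'"
    )


print(create_model_sql("AAPL"))
```

Every selected column other than the timestamp and the target is treated as an external regressor, which is why the SELECT list carries the indicator columns alongside the closing price.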

Models are retrained on the first of every month. A retrain for the full universe takes roughly 45 minutes and runs unattended on a Cloud Workflow. The output is a 90-day forecast with upper and lower confidence bounds, stored in a BigQuery view named v_stock_forecasts.

3. The composite score

A forecast on its own is not enough. Two stocks can have identical predicted 90-day moves and very different risk. We rank them with a composite score that blends the forecast with 24 fundamental metrics. Importantly, we learned the weights instead of hand-picking them — our hand-picked weights turned out to be inversely correlated with realized returns.

Training data

30,708 historical samples across two years of daily data. Each sample pairs a point-in-time snapshot of a stock's fundamentals with the actual 90-day forward return that followed it. This is the target we regress against.
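
The pairing of snapshot and outcome can be sketched as follows. This is an illustration under assumptions: the function and field names are hypothetical, and "90-day" is taken to mean 90 rows of daily data, which this page does not specify.

```python
from typing import Dict, List, Tuple

HORIZON = 90  # assumed to mean 90 daily rows; the exact convention is not stated


def build_samples(
    closes: List[float],
    snapshots: List[Dict[str, float]],
    horizon: int = HORIZON,
) -> List[Tuple[Dict[str, float], float]]:
    """Pair each day's fundamentals snapshot with the return realized `horizon` rows later."""
    if len(closes) != len(snapshots):
        raise ValueError("closes and snapshots must be aligned row for row")
    samples = []
    # The last `horizon` rows have no realized outcome yet, so they are skipped.
    for t in range(len(closes) - horizon):
        forward_return = closes[t + horizon] / closes[t] - 1.0
        samples.append((snapshots[t], forward_return))
    return samples


# Toy series: 100 days of prices rising by 1.0 per day.
closes = [100.0 + i for i in range(100)]
snapshots = [{"payoutRatio": 0.4} for _ in closes]
samples = build_samples(closes, snapshots)
print(len(samples), round(samples[0][1], 4))  # 10 0.9
```

Because each snapshot is only paired with prices that come after it, no sample's features can see its own target.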

Models evaluated

We evaluated Ridge regression, Lasso, ElasticNet, RandomForest, and GradientBoosting using a five-fold TimeSeriesSplit (so we never train on the future). GradientBoosting had the best raw R² (0.1431) but we ship Ridge coefficients to production because they're interpretable and stable.
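
For readers unfamiliar with time-ordered cross-validation, the following pure-Python sketch shows an expanding-window split in the spirit of scikit-learn's TimeSeriesSplit. It is an illustration, not our evaluation code, and it assumes the rows are already sorted chronologically.

```python
from typing import Iterator, List, Tuple


def time_series_splits(
    n_samples: int, n_splits: int = 5
) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train, test) index lists where every test fold follows its training data."""
    test_size = n_samples // (n_splits + 1)
    if test_size == 0:
        raise ValueError("not enough samples for the requested number of splits")
    first_test_start = n_samples - n_splits * test_size
    for test_start in range(first_test_start, n_samples, test_size):
        train = list(range(0, test_start))  # everything strictly before the test fold
        test = list(range(test_start, test_start + test_size))
        yield train, test


for train, test in time_series_splits(12, n_splits=5):
    print(len(train), test)
# 2 [2, 3]
# 4 [4, 5]
# 6 [6, 7]
# 8 [8, 9]
# 10 [10, 11]
```

Each fold trains on a longer history than the one before it, and no training index is ever later than a test index.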

Top-weighted features (Ridge)

Feature                    Coefficient   Direction   Reading
payoutRatio                −5.87         Negative    High payout = worse forward returns
dividendYield              +5.12         Positive    Higher yield = better forward returns
log(marketCap)             +1.92         Positive    Larger cap = stability premium
returnOnCapitalEmployed    +1.84         Positive    Better capital efficiency = better returns
pe_vs_sector_avg           −1.83         Negative    High relative P/E = overvalued
operatingProfitMargin      −1.45         Negative    Already-high margins = less upside

All features are bounded and centered before scoring, so a pathological input (say, a payoutRatio of 4,000%) can't flip the result. The full SQL is in bigquery_sql/05_fundamentals.sql in our source repository.
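
The effect of bounding and centering can be illustrated with a small sketch. The two coefficients are the Ridge values listed above; the bounds and centers are hypothetical placeholders, the ratios are assumed to be expressed as fractions (so 40.0 means 4,000%), and the production logic is the SQL, not this Python.

```python
from typing import Dict, Tuple

# Ridge coefficients for two of the top-weighted features listed above.
COEFFICIENTS: Dict[str, float] = {
    "payoutRatio": -5.87,
    "dividendYield": 5.12,
}

# Hypothetical (lower, upper, center) values, chosen only for illustration.
BOUNDS: Dict[str, Tuple[float, float, float]] = {
    "payoutRatio": (0.0, 1.5, 0.4),
    "dividendYield": (0.0, 0.10, 0.02),
}


def composite_score(features: Dict[str, float]) -> float:
    """Clip each feature to its bounds, subtract its center, then apply the weight."""
    score = 0.0
    for name, coef in COEFFICIENTS.items():
        lower, upper, center = BOUNDS[name]
        bounded = min(max(features[name], lower), upper)
        score += coef * (bounded - center)
    return score


at_bound = composite_score({"payoutRatio": 1.5, "dividendYield": 0.03})
extreme = composite_score({"payoutRatio": 40.0, "dividendYield": 0.03})
print(at_bound == extreme)  # True: the 4,000% payout ratio is clipped to the bound
```

Clipping caps how much any single feature can contribute, so one bad data point moves the score by at most the coefficient times the width of the allowed range.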

4. Validating the score

We care about one thing: when we rank stocks by composite score, do the higher-ranked ones actually earn more than the lower-ranked ones? To answer it we bucket stocks into quintiles by score and report the mean realized 90-day return per bucket (out-of-fold).

Quintile              Avg 90-day return   Win rate
Q5 (highest score)    +10.76%             64%
Q4                    +8.45%              61%
Q3 (median)           +6.89%              58%
Q2                    +5.12%              56%
Q1 (lowest)           +3.06%              54%

Monotonically increasing. Q5 beats Q1 by +7.70 percentage points, and the win rate rises with the score. A scoring change is not allowed to ship unless it maintains or widens this spread on a fresh time-split.
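
The quintile check can be sketched in a few lines. This is an illustration with hypothetical function names and toy data; it assumes "win rate" means the share of samples with a positive realized return, which this page does not define.

```python
from statistics import mean
from typing import Dict, List, Tuple


def quintile_report(samples: List[Tuple[float, float]]) -> Dict[str, Dict[str, float]]:
    """samples are (composite_score, realized_return) pairs; returns stats for Q1..Q5."""
    if len(samples) < 5:
        raise ValueError("need at least five samples to form quintiles")
    ranked = sorted(samples, key=lambda s: s[0])  # lowest score first
    n = len(ranked)
    report = {}
    for q in range(5):
        bucket = ranked[q * n // 5 : (q + 1) * n // 5]
        returns = [r for _, r in bucket]
        report[f"Q{q + 1}"] = {
            "avg_return": mean(returns),
            "win_rate": sum(r > 0 for r in returns) / len(returns),
        }
    return report


def spread_is_monotonic(report: Dict[str, Dict[str, float]]) -> bool:
    """True when the average return strictly increases from Q1 to Q5."""
    avgs = [report[f"Q{q}"]["avg_return"] for q in range(1, 6)]
    return all(a < b for a, b in zip(avgs, avgs[1:]))


# Toy data in which returns rise with the score.
toy = [(float(i), i * 0.01 - 0.02) for i in range(10)]
report = quintile_report(toy)
print(spread_is_monotonic(report))  # True
```

A release gate of the kind described above would compare the Q5 minus Q1 spread from a candidate scoring change against the current one on the same held-out period.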

5. Quality tiers

The raw score is continuous. To make it legible in the app we bucket stocks into five tiers — Strong Buy, Buy, Neutral, Sell, Strong Sell — using the combination of composite score and each ticker's individual back-tested accuracy. The Home tab only surfaces Strong Buy and Buy; every other tier is reachable via Search with filters.

6. The LLM outlook

For per-stock outlooks, we use Gemini 2.5 Flash. It runs against a structured prompt that includes:

  • Our ML prediction and the back-tested accuracy for this specific ticker.
  • The latest financial statements and upcoming earnings date, from FMP.
  • Live market indicators: VIX, S&P 500, Nasdaq, 2Y/10Y Treasury yields, yield-curve status, and sector performance.

The model returns a summary, a bull case, a bear case, a list of catalysts, and a confidence level that we adjust for VIX and yield-curve inversion. Outlooks are cached in Firestore and invalidated when the ML retrains or the user requests a refresh.
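
The caching rule (reuse an outlook until the ML retrains or the user asks for a refresh) can be sketched as below. A plain dict stands in for Firestore, and the names CachedOutlook, model_version, and get_outlook are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class CachedOutlook:
    text: str
    model_version: str  # hypothetical identifier of the retrain behind this outlook


def get_outlook(
    cache: Dict[str, CachedOutlook],
    ticker: str,
    current_model_version: str,
    generate: Callable[[str], str],
    force_refresh: bool = False,
) -> str:
    """Return the cached outlook unless it predates the latest retrain or a refresh is forced."""
    cached = cache.get(ticker)
    stale = cached is None or cached.model_version != current_model_version
    if stale or force_refresh:
        cached = CachedOutlook(generate(ticker), current_model_version)
        cache[ticker] = cached
    return cached.text


calls: List[str] = []


def fake_generate(ticker: str) -> str:
    """Stand-in for the LLM call; records each invocation."""
    calls.append(ticker)
    return f"outlook for {ticker}"


cache: Dict[str, CachedOutlook] = {}
get_outlook(cache, "AAPL", "2024-03", fake_generate)  # first request: generated
get_outlook(cache, "AAPL", "2024-03", fake_generate)  # same retrain: served from cache
get_outlook(cache, "AAPL", "2024-04", fake_generate)  # new retrain: regenerated
get_outlook(cache, "AAPL", "2024-04", fake_generate, force_refresh=True)  # user refresh
print(len(calls))  # 3
```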

We never pass personal data into the LLM. Prompts only contain publicly available market data and the stock's own numbers.

7. ETF money flow

The Insights tab answers a different question: where is capital moving this week? We track approximately 4,673 US-listed ETFs and compute the week-over-week delta in each fund's AUM (FMP's marketCap field for ETFs). We then weight each delta by that ETF's sector and country exposures (we pull weightings for the top 500 funds monthly) and aggregate.

sector_flow  = Σ_etf (ΔAUM_etf × sector_weight_pct / 100)
country_flow = Σ_etf (ΔAUM_etf × country_weight_pct / 100)
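
A worked example of the sector aggregation, using made-up funds and hypothetical field names; the country aggregation has the same shape with country weights in place of sector weights.

```python
from collections import defaultdict
from typing import Dict, List


def sector_flows(etfs: List[dict]) -> Dict[str, float]:
    """Sum each ETF's week-over-week AUM change, split by its sector weights (in percent)."""
    flows: Dict[str, float] = defaultdict(float)
    for etf in etfs:
        delta_aum = etf["aum_now"] - etf["aum_week_ago"]
        for sector, weight_pct in etf["sector_weights_pct"].items():
            flows[sector] += delta_aum * weight_pct / 100
    return dict(flows)


# Two made-up funds: one gained 100 in AUM, the other lost 50.
etfs = [
    {
        "symbol": "AAA",
        "aum_now": 1100.0,
        "aum_week_ago": 1000.0,
        "sector_weights_pct": {"Technology": 60.0, "Energy": 40.0},
    },
    {
        "symbol": "BBB",
        "aum_now": 450.0,
        "aum_week_ago": 500.0,
        "sector_weights_pct": {"Energy": 100.0},
    },
]

flows = sector_flows(etfs)
print(flows)  # {'Technology': 60.0, 'Energy': -10.0}
```

Energy nets out to -10 because the 40 attributed from the first fund is more than offset by the 50 that left the second.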

Results are published at the sector, country, sector×country, and individual-ETF level, plus a "top movers" list and a pipeline-generated AI summary of the week.

8. Serving

Everything above runs in the pipeline. The app does not talk to BigQuery. Instead, the nightly Cloud Run job (stock-firestore-sync) pushes the latest rankings, forecasts, and per-ticker histories into Firestore. Our FastAPI backend reads from Firestore and returns results in the ~50 ms range. Everything the iOS app uses is under the /fast/* endpoint prefix.

9. Security and integrity

  • Firebase App Check verifies that every API request comes from the genuine iOS app (App Attest) before the backend will answer.
  • Sign in with Apple and Google are implemented via Firebase Authentication. We never see or store passwords.
  • All backend traffic uses HTTPS. Secrets are stored in Google Secret Manager, not in source.
  • The entire pipeline runs on Google Cloud Platform in US regions, with BigQuery, Cloud Storage, Firestore, Cloud Run, and Cloud Workflows.

10. What we can't do (and won't pretend to)

Oxeye is a rating system. It does not account for your tax situation, your risk tolerance, your liquidity needs, news that breaks intraday, or any piece of private information. It can be — and sometimes is — wrong on individual names. The Q5-beats-Q1 spread is a statistical result over a large basket, not a guarantee about any single stock.

If you would like to dig into a specific methodology question or see raw back-test data, write to sridhar@oxeye.tech.