User Guide
This guide covers everything you need to know about using Bearing to find, validate, and compare AI models.
Getting a recommendation
This is the main way most people use Bearing. You describe what you want to do, and Bearing tells you which model is the best fit.
Step 1: Describe your task
On the home page, type a description of what you want to use AI for. Be specific — "summarise long research papers into bullet points" works better than "help me with documents."
Step 2: Answer any follow-up questions
If Bearing needs more context, you'll see a few quick questions with tappable options. Things like "Is this a one-off task or something you'll do regularly?" or "Does this involve images?" Answer them and tap Continue.
If your description is clear enough, you'll skip this step entirely.
Step 3: Rank your priorities
You'll see seven factors, each representing something that might matter to you:
- Quality — best possible output for your task
- Capability — specific features like vision, code generation, or long documents
- Cost — keeping spend low
- Transparency — open weights, published training data, methodology
- Privacy — how your data is handled and retained
- Sustainability — energy use and environmental footprint
- Speed — how fast responses come back
Drag them into the order that matters to you. The factor at the top has the most influence on your results. Tap the up/down arrows if you prefer not to drag.
Step 4: Review your results
You'll see a ranked list of models, best fit first. Each model card shows:
- A match score (percentage) showing how well it fits your task and priorities
- A one-sentence explanation of why it ranked where it did
- Per-factor scores so you can see exactly where each model is strong or weak
- An estimated cost for your specific task type
The top recommendation is highlighted, but you can choose any model from the list.
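The match score and per-factor scores described above can be sketched as a priority-weighted average. This is an illustrative guess, not Bearing's published formula: the factor names come from Step 3, but the weighting scheme (linear decay by rank) is an assumption.

```typescript
// Illustrative sketch only. Bearing's real scoring formula isn't documented here.
// Assumes per-factor scores in [0, 1] and a priority order where earlier = heavier.

type Factor = "quality" | "capability" | "cost" | "transparency"
            | "privacy" | "sustainability" | "speed";

// One simple choice: linearly decaying weights by rank, normalised to sum to 1.
function priorityWeights(order: Factor[]): Map<Factor, number> {
  const n = order.length;
  const raw = order.map((_, i) => n - i);        // top factor gets n, last gets 1
  const total = raw.reduce((a, b) => a + b, 0);
  return new Map(order.map((f, i) => [f, raw[i] / total]));
}

// Weighted average of a model's per-factor scores, expressed as a 0-100% match.
function matchScore(scores: Record<Factor, number>, order: Factor[]): number {
  const weights = priorityWeights(order);
  let sum = 0;
  for (const [factor, weight] of weights) sum += weight * scores[factor];
  return Math.round(sum * 100);
}
```

Under this scheme, the factor you drag to the top of seven carries a quarter of the total weight (7/28), which is why reordering noticeably reshuffles the results.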
Pipeline recommendations
If your task involves multiple steps — like extracting text from a PDF and then summarising it — Bearing may suggest a pipeline alternative below the main results. A pipeline recommends a different specialist model for each stage of your task, often resulting in better quality or lower cost than using a single model for everything.
Each stage shows a recommended model and one alternative. The footer compares the total pipeline cost against the top single-model recommendation so you can see the trade-off at a glance.
Not every task gets a pipeline suggestion — only tasks where splitting the work across models would genuinely help.
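The pipeline footer's trade-off can be sketched as summing per-stage costs and comparing against the top single-model estimate. The `Stage` shape and field names below are hypothetical, not Bearing's actual data model.

```typescript
// Hypothetical sketch of the pipeline footer comparison. Shapes are illustrative.

interface Stage {
  name: string;           // e.g. "extract text", "summarise"
  model: string;          // the specialist model recommended for this stage
  estimatedCost: number;  // estimated cost per task, in USD
}

function pipelineSummary(stages: Stage[], singleModelCost: number): string {
  const total = stages.reduce((sum, s) => sum + s.estimatedCost, 0);
  const diff = singleModelCost - total;
  const verdict = diff > 0
    ? `saves $${diff.toFixed(4)} per task`
    : `costs $${(-diff).toFixed(4)} more per task`;
  return `Pipeline $${total.toFixed(4)} vs single model $${singleModelCost.toFixed(4)}: ${verdict}`;
}
```

A pipeline can also win on quality rather than cost, in which case the footer's comparison is a trade-off to weigh rather than a strict saving.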
Step 5: Select a model
Tap Use this one on the model you want to try. This records your choice (anonymously) and helps improve future recommendations.
After selecting, you'll see a link to give feedback later — bookmark it if you want to come back after trying the model.
Validating your current model
Already using a model? Check if it's the best fit for what you're doing.
How it works
- Go to the Validate tab on the home page, or visit the Validate page directly
- Start typing your model's name — a dropdown will show matching models from the registry
- Describe what you use the model for
- Tap Check my model
What you'll see
Bearing will show one of three assessments:
- Good fit — your model is a strong choice for this task. You'll see why, with a scoring breakdown.
- Overpaying — your model works fine, but you could get similar results for less. You'll see the cheaper alternative.
- Better options exist — for this task type, other models would likely perform better. You'll see the top alternatives and where your current model ranks.
In all cases, the full ranked list is shown below so you can explore.
Comparing two models
For tasks where specs alone don't tell the full story — creative writing, nuanced analysis, tone-sensitive work — you can run a head-to-head comparison.
Requirements
- You need to sign in with your email (magic link, no password)
- You get 2 comparisons per day (to manage API costs)
How it works
- From your recommendation results, tap Compare two models head-to-head
- Select exactly two models from your ranked list
- Write or edit a prompt for your task
- Optionally attach a file (PDF or CSV, up to 5MB) — both models will process the same document alongside your prompt
- Tap Run comparison — both models receive the same prompt and file
- Read both responses side by side (model names are visible — this isn't blind testing)
- Vote: Model A, Model B, or About the same
- Optionally explain why
Attaching files
When you attach a file, models that support vision (marked with a Vision badge) receive the raw document. Text-only models receive the extracted text content instead. This means you can compare how a vision model handles a PDF layout versus how a text model handles the same content as plain text.
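The routing rule above can be sketched in a few lines: vision-capable models receive the raw file, text-only models receive the extracted text. The field and capability names are illustrative, not Bearing's internal API.

```typescript
// Sketch of the attachment routing rule described above. Names are illustrative.

interface Model {
  name: string;
  capabilities: string[];  // e.g. ["vision", "code", "tools"]
}

interface Attachment {
  rawFile: Uint8Array;     // the uploaded PDF or CSV bytes
  extractedText: string;   // plain-text extraction of the same document
}

function payloadFor(model: Model, attachment: Attachment): Uint8Array | string {
  return model.capabilities.includes("vision")
    ? attachment.rawFile        // vision model reads the document layout directly
    : attachment.extractedText; // text-only model sees plain text instead
}
```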
Your preference is recorded as pairwise data — the same format used by research benchmarks like Chatbot Arena, but anchored to your specific task type and priorities.
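A pairwise record of this kind might look like the shape below. This is an illustrative schema guessed from the anonymised fields the guide describes, not Bearing's actual export format.

```typescript
// Illustrative shape for one pairwise preference record. Not Bearing's real schema.

interface PairwisePreference {
  taskType: string;          // classified task attribute, never the raw prompt
  modelA: string;
  modelB: string;
  winner: "A" | "B" | "tie"; // the vote: Model A, Model B, or About the same
  priorities: string[];      // the user's ranked factors at comparison time
}

const example: PairwisePreference = {
  taskType: "summarisation",
  modelA: "model-x",
  modelB: "model-y",
  winner: "A",
  priorities: ["quality", "cost", "speed"],
};
```

Anchoring each vote to a task type and priority ranking is what lets the data answer narrower questions than a global leaderboard can, such as which model wins for cost-sensitive summarisation.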
Browsing the model registry
Visit the Models page to explore all 29 models in the registry.
Filtering
- Search — type a model name, provider, or slug
- Provider filters — tap a provider name to see only their models
- Capability filters — tap Vision, Code, Tools, Long context, Reasoning, Audio, or Video to filter by capability
Tap any filter again to remove it, or use Clear filters to reset.
Model detail pages
Click any model card to see its full profile:
- Capabilities, strengths, and weaknesses
- Pricing (per million tokens, input and output)
- Transparency scores (open weights, training data, methodology, licence, provider disclosure)
- Sustainability data (inference energy, training footprint, provider infrastructure)
- Task fitness bars showing how well the model performs across different task types
Want to understand how these scores are calculated? See How We Rate Models for the full methodology, research sources, and decisions behind every rating.
Giving feedback
After trying a model, come back to give feedback using the bookmarkable link from your results page.
- Worked well — great, this helps confirm the recommendation
- Not great — select what went wrong: too slow, poor quality, too expensive, couldn't do what you needed, or other
- Add an optional comment
Feedback is the most valuable signal for improving recommendations. Even a thumbs up helps.
The public dataset
All anonymised data is available for download on the Data page.
- Recommendation data — task types, priorities, which model was recommended and chosen, outcomes
- Comparison data — which model was preferred in head-to-head tests
Available in JSON and CSV formats. Never includes raw descriptions, prompts, email addresses, or anything that could identify you.
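Working with a downloaded JSON export might look like the sketch below. The record fields mirror the anonymised attributes listed above, but the exact export schema and field names are assumptions.

```typescript
// Sketch of analysing a downloaded JSON export. Field names are assumptions.

interface RecommendationRecord {
  taskType: string;
  priorities: string[];
  recommendedModel: string;
  chosenModel: string;
  outcome?: "worked_well" | "not_great";
}

// Of the times a model was recommended, how often did users actually pick it?
function selectionRate(records: RecommendationRecord[], model: string): number {
  const recommended = records.filter(r => r.recommendedModel === model);
  if (recommended.length === 0) return 0;
  const chosen = recommended.filter(r => r.chosenModel === model);
  return chosen.length / recommended.length;
}
```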
Managing models (admin)
If you have admin access, you can add, edit, and deactivate models directly from the browser.
Accessing the admin panel
Visit /admin while signed in with an admin account. You'll see a table listing every model in the registry with its name, provider, tier, speed score, and pricing.
Editing a model
Click Edit next to any model to open the edit form. The form is organised into sections:
- Basic info — name, provider, and tier category
- Pricing — input and output cost per million tokens
- Performance — context window size, plus speed and privacy scores (sliders from 0 to 1)
- Capabilities — toggle which capabilities the model supports (vision, code, tools, etc.)
- Task fitness — adjust how well the model performs across different task types using sliders
- Transparency — sub-scores for open weights, training data, methodology, licence, and provider disclosure
- Sustainability — inference energy, training footprint, and provider infrastructure scores
- Strengths and weaknesses — editable lists of plain-text descriptions
Click Save Model when you're done. Changes appear immediately on the Models page. To update the recommendation engine's snapshot, run the registry generation step during the next deployment.
Adding a new model
Click Add Model on the admin page to open a blank form. Choose a URL-safe slug (e.g. my-new-model) — this cannot be changed after creation.
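A URL-safe slug of the kind shown (`my-new-model`) is typically lowercase letters and digits separated by single hyphens. The exact rule Bearing enforces isn't documented, so the check below is a guess at the convention.

```typescript
// Assumed slug rule: lowercase alphanumeric segments joined by single hyphens,
// no leading or trailing hyphen. Bearing's actual validation may differ.

function isValidSlug(slug: string): boolean {
  return /^[a-z0-9]+(-[a-z0-9]+)*$/.test(slug);
}
```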
Deactivating a model
Deactivated models are hidden from the registry and recommendations but preserved in the database for historical data integrity. Deactivation is available through the admin server actions.
Usage and Insights dashboard
The admin panel includes Usage and Insights tabs alongside the model list.
Usage shows:
- Total tasks, users, selections, and comparisons
- Activity over time (tasks and selections per day, week, or month)
- Mode breakdown (Recommend, Validate, Compare)
- User signups over time
Insights shows:
- Outcome success rate and average selected rank
- Task type distribution — what people are using AI for
- Model leaderboard — which models are recommended and selected most
- Outcome breakdown — success vs failure with failure reasons
- Capability demand — how often tasks need vision, tools, or code
Use the toggle in the top-right to switch between daily, weekly, and monthly views.
Discovering new models
The Discover tab shows AI models available on OpenRouter that aren't yet in the Bearing registry. You can search by name or provider, then import individual models.
When you import a model:
1. Click Import — a form opens with the model's specs from OpenRouter (name, pricing, context window, capabilities)
2. Click Generate Estimates — Claude Haiku suggests initial scores for task fitness, transparency, sustainability, and other factors based on the model's metadata
3. Review and adjust the estimates — all fields are editable
4. Click Save as Draft — the model is added to the registry but not yet visible to users
To make a draft model live, go to the Models tab, find the model, and edit it to set it as active.
Syncing pricing
Click Sync Pricing on the Discover tab to update pricing for all models from OpenRouter's latest data. You'll see a summary of how many models were updated.
Privacy
Bearing does not store your task descriptions or comparison prompts. What we store:
- A hash of your description (for deduplication, not reversible)
- Classified task attributes (type, complexity, input length)
- Your priority ranking
- Which model you chose and at what rank
- Your outcome feedback (if you give it)
- Your comparison preferences (if you compare models)
For sign-in, we store your email address. Nothing else about you.
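The non-reversible deduplication hash described above can be sketched with a standard cryptographic digest. Bearing's actual scheme isn't specified; the light normalisation step here is an assumption about how near-identical descriptions might be made to collide for deduplication.

```typescript
// Sketch of a non-reversible description hash using SHA-256.
// The normalisation step is an assumption, not Bearing's documented behaviour.
import { createHash } from "node:crypto";

function descriptionHash(description: string): string {
  // Normalise lightly so trivially different phrasings still match for dedup.
  const normalised = description.trim().toLowerCase();
  return createHash("sha256").update(normalised).digest("hex");
}
```

Because the digest is one-way, the stored hash can confirm "we've seen this description before" without anyone being able to recover the text itself.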