The SF Engineer

Thursday, May 14, 2026 · 12 stories

AI, software, and startup news with Bay Area local stories mixed in

Stories in this brief

State media control influences large language models

Nature · May 13

State media control appears to influence large language models, according to research published in Nature. Studies show LLMs exhibit a pro-government bias in languages from countries with less media freedom. Chinese state-controlled media was found in LLM training datasets, leading to more favorable responses about Chinese institutions when the models were trained on this data. This suggests states may leverage media control to shape AI output.

Clio’s $500M milestone arrives just as Anthropic ups the ante

TechCrunch · May 14

Clio, a legal tech company, has reached $500 million in annual recurring revenue, TechCrunch reports. The milestone highlights the growing adoption of AI in the legal industry, with companies like Clio seeing accelerated growth after integrating the technology. The rapid success of other AI legal tech startups underscores the trend.

Who trusts Sam Altman?

TechCrunch · May 13

OpenAI CEO Sam Altman is facing scrutiny over his honesty and qualifications in a California federal court. Lawyers for Elon Musk are questioning his past statements, including an admission that he had economic exposure to OpenAI through a venture capital fund. The trial is examining whether Altman is fit to control advanced AI models and whether OpenAI's structure allows its board to exercise true oversight. TechCrunch reports this is a pivotal moment for Altman's credibility.

Efficient robot navigation inspired by honeybee learning flights

Nature · May 13

Researchers have developed a robot navigation system called Bee-Nav, inspired by honeybee learning flights. This system allows small drones to efficiently return to a home location after long journeys. It combines path integration with a visual homing network, requiring minimal computational resources. In experiments, a drone successfully returned to within half a meter of its starting point on numerous flights, even in windy conditions. This advancement is reported in the journal "Nature."

Bee-inspired navigation robot pinpoints its home using a neural network

Nature · May 13

A new flying robot can pinpoint its home using a neural network, mimicking how honeybees learn their environment. Unlike traditional robots needing detailed maps, this bee-inspired approach corrects navigational errors during long flights. The research, published in Nature, offers a more computationally efficient way for robots to navigate complex spaces.

The most unlikely part of this Giants season? They keep bullying the Dodgers

SF Standard · May 13

The San Francisco Giants are unexpectedly dominating the Los Angeles Dodgers this season, a stark contrast to previous years. Giants catcher Eric Haase had a standout game, hitting two home runs against Dodgers pitcher Yoshinobu Yamamoto. This successful run against their rivals is crucial for the Giants' playoff hopes, as SF Standard reports.

Meta profited from illegal scam ads, California county lawsuit alleges

The Guardian - Tech · May 13

Santa Clara County in California is suing Meta Platforms, alleging the company profited from illegal scam ads on Facebook and Instagram, The Guardian reports. The lawsuit claims Meta violated state laws by tolerating fraudulent advertising. According to the suit, Meta earned billions from these "high-risk" ads and allegedly created "guardrails" to limit scam reduction efforts if they impacted profits.

Full Transcript

HOST

If you build or deploy multilingual LLMs, this shifts your audit priorities.

AISHA

The training data shows a 41x skew toward Chinese government domains over Wikipedia. Cross-national audits tie this to media freedom indices: models answering in the languages of low-freedom countries show pro-government valence spikes that no prompt engineering erases. In China's case, state-scripted outlets like Xinhua turn up in the datasets, and extra pretraining on them flips institutional sentiment from neutral to favorable. Commercial models mirror it: the same query yields glowing answers about China in Chinese and more restrained ones in English.
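
For teams acting on that audit point, one rough way to check for a language-dependent tilt is to score matched answers per language and compare the means. The sketch below is illustrative only: the function name, the -1 to 1 valence scale, and the sample scores are hypothetical placeholders, not figures or methods from the Nature study.

```python
from statistics import mean

def valence_gap(scores_by_language, target, baseline):
    """Mean valence in `target` minus mean valence in `baseline`.

    Scores are assumed to lie in [-1, 1], where positive means more
    favorable toward the institution asked about (hypothetical scale).
    """
    return mean(scores_by_language[target]) - mean(scores_by_language[baseline])

# Placeholder scores for the same prompts answered in two languages.
# These values are made up for illustration, not measured results.
sample_scores = {
    "zh": [0.5, 0.75, 0.25],
    "en": [0.0, 0.25, -0.25],
}

gap = valence_gap(sample_scores, target="zh", baseline="en")
print(f"zh minus en valence gap: {gap:+.2f}")
```

A positive gap means the target-language answers read as more favorable. How the scores get produced (human raters, a classifier, or something else) is left open here.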

HOST

What tension does this create for global model distributors?

AISHA

Commercial LLMs already bake in the skew: Mandarin queries about China draw more praise than the same queries in English, by clear margins. States with tight media control gain outsized pull on outputs, turning datasets into policy tools without any code changes. The effect is replicated across vendors, so distributors must either scrub national sources or risk tilted inference at scale. Global information flows bend toward the tightest controls.

HOST

You would think legal tech plateaus at SaaS billing. Clio just proved it doesn't.

PRIYA

Clio's $500 million ARR locks in AI as the new billing engine. Their 2023 Claude integration sped contract analysis by 5x for mid-size firms, while the $1 billion vLex buy added precedent search depth. Anthropic's push into legal-tuned models now pits them against Harvey, which runs on Claude APIs.

HOST

How does Anthropic's expansion hit Harvey's margins?

PRIYA

Harvey pays 20-30% API premiums to Anthropic for inference volume. Anthropic's direct legal tools cut that middleman, forcing Harvey to drop prices or build custom fine-tunes. Platforms like Legora face the same squeeze on model dependency.

HOST

Sam Altman's court testimony just got dissected. What hangs in the balance for OpenAI governance?

PRIYA

Altman's LP stake in Y Combinator's fund, which carries OpenAI exposure, contradicts his prior under-oath claim of no financial ties. Musk's team hammers this alongside the board's 2023 "blip" ouster, where Altman returned faster than directors could regroup. The court is now probing whether that exposes flaws in the capped-profit commitments that bind the board. AGI stewards are watching closely, since a ruling could mandate structural fixes.

HOST

How does this test Altman's fitness for advanced models?

PRIYA

The 2023 board revolt mirrors today's cross-examination pressure on Altman's candor about Y Combinator investments. His quick reinstatement back then showed that directors lacked leverage against his pull. Federal judges weigh this for ethical control of models like GPT-4. A credibility hit reframes OpenAI's for-profit pivot as board-proofed against one man's sway.

HOST

If you build insect-scale robots, this changes your homing math.

AISHA

Bee-Nav packs a 3.4-kilobyte network that achieves 100% homing success within 110 meters, blending idiothetic (self-motion) path integration with allothetic (external) visual cues modeled on honeybee orientation flights. Unlike SLAM's gigabyte maps, it stores one panoramic snapshot and runs inference on edge hardware, and it was proven in 600-meter drone trials with a 70% success rate in wind. This reveals how bees cancel integration errors via sparse visual memory.

HOST

Why does 3.4 kilobytes enable 600-meter homing?

AISHA

The network learns a compact scene descriptor during a brief learning flight, then inverts optic flow errors against it, much like a bee resetting its home vector without full odometry. It trades map scale for flight replay, hitting sub-meter accuracy in wind where pure integration fails past 100 meters. Resource limits now match insect performance.
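
The path-integration half of that description can be sketched in a few lines. This is a minimal illustration of dead reckoning in general, not Bee-Nav's implementation: the function names, the flat 2D world, and the constant heading bias used to show drift are all assumptions made for the example.

```python
import math

def integrate_path(steps, heading_bias=0.0):
    """Accumulate (heading_rad, distance_m) steps into an (x, y) position.

    `heading_bias` is a hypothetical constant compass error, added to
    every step to show how uncorrected integration drifts.
    """
    x = y = 0.0
    for heading, distance in steps:
        x += distance * math.cos(heading + heading_bias)
        y += distance * math.sin(heading + heading_bias)
    return x, y

def home_vector(position):
    """Return (bearing_rad, distance_m) from `position` back to the origin."""
    x, y = position
    return math.atan2(-y, -x), math.hypot(x, y)

# Outbound leg: 100 m east, then 100 m north.
steps = [(0.0, 100.0), (math.pi / 2, 100.0)]
bearing, distance = home_vector(integrate_path(steps))

# Same flight with a 0.01 rad heading bias: the estimate and the
# true position diverge, and the gap grows with distance flown.
true_x, true_y = integrate_path(steps)
est_x, est_y = integrate_path(steps, heading_bias=0.01)
drift = math.hypot(est_x - true_x, est_y - true_y)
```

The home vector is just the accumulated displacement reversed, which is why any sensing error carries straight into the return leg. A visual homing step, as the segment describes, is what would cancel that residual near home.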

HOST

What's the trade-off behind the 70% success rate in wind?

AISHA

Gusts amplify flow variance by 30%, dropping pixel-match confidence below threshold on 3 in 10 flights. But replanning from partial cues cuts retry loops compared with full remapping. That edges out alternatives for battery-constrained swarms.

HOST

If you build aerial autonomy stacks, this flips drift correction math.

AISHA

Bee-inspired neural nets achieve self-localization with 2.5x less compute than ORB-SLAM3 on long-range flights. Researchers at the University of Edinburgh trained the network on honeybee learning flights (figure-eight maneuvers that build viewpoint-invariant maps), letting the drone recover pose from monocular images alone. It counters odometry drift accumulating at 1-2% per kilometer, sidestepping LiDAR's power draw. Exploration rovers in GPS-denied caves inherit that edge.
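
To put the 1-2% drift figure in scale, a quick calculation helps. This assumes the figure means position error grows as a fixed fraction of distance flown, which the segment does not spell out. The function name and default rates are hypothetical.

```python
def drift_bounds_m(distance_km, low_rate=0.01, high_rate=0.02):
    """Return (low, high) accumulated position error in meters.

    Assumes error is a fixed fraction of distance flown, with the
    1% and 2% rates taken from the figure quoted in the segment.
    """
    distance_m = distance_km * 1000.0
    return distance_m * low_rate, distance_m * high_rate

low, high = drift_bounds_m(10.0)
print(f"After 10 km: {low:.0f} to {high:.0f} m of drift")
```

Under that assumption, a 10 km flight would accumulate roughly 100 to 200 meters of error before any visual correction is applied.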

HOST

How does it stack against EKF baselines?

AISHA

The common assumption is that an extended Kalman filter (EKF) fuses IMU data best. Edinburgh's net cuts error 85% on 10 km paths by embedding temporal sequences, much like bee path integration. The architecture processes 30 fps image streams into a 512-dim latent space, decoding home bearings without explicit feature matching: think RNNs, but with insect-like shortcut learning. Replicated wind-tunnel runs match sims to 95% confidence. Delivery platforms like Zipline price fuel 15% lower without map priors.

HOST

Picture the Giants, perennial underdogs, pounding the Dodgers like it's their personal grudge match.

JORDAN

The Giants lead the season series 4-1 against the Dodgers. Eric Haase crushed two homers off Yoshinobu Yamamoto on Tuesday, his first multi-homer game versus that ace, in a 10-4 rout. They went 7-19 against LA last year and were outscored by 51 runs. That flips the script now, breathing life into their wild card chase.

HOST

If you run high-risk ad campaigns on Meta, this suit rewrites your risk math.

MARCUS

Meta's 2022 Australian class action over scam ads drew $100 million in settlements after similar tolerance claims. Santa Clara County alleges Meta earns up to $7 billion yearly from high-risk ads on Facebook and Instagram, breaching false advertising and unfair practices statutes. The suit says Meta set cost-based guardrails to throttle fraud remediation if it hits revenue, with AI systems potentially aiding unethical marketers. That setup mirrors historical ad arbitrage plays that juiced growth until liability caught up.

HOST

How does this expose tensions in their anti-scam stack?

MARCUS

The common assumption is that Meta's Proactive Detection fully curbs scams; the filing instead describes guardrails that prioritize ad revenue over takedowns. Santa Clara points to billions in high-risk ad profits and argues that Meta's reassurances about scam measures mask how bogus campaigns inflate earnings. The suit frames this under California law as deliberate deception. The precedent from past FTC actions shows why platforms face mounting pressure to realign incentives.
