Inside OpenAI: The Intense 7-Week Development of Codex and the Culture of Rapid Innovation

Ex-OpenAI engineer reveals the company's fast-paced environment, including 7 weeks to develop Codex, intense work hours, and a culture of action-driven innovation amid rapid growth and strict security.

Unveiling the daily life at OpenAI.

OpenAI has always been a focus of media attention, especially after several core employees left, sparking widespread discussions about its internal culture and management style.

Recently, former employee Calvin French-Owen published a reflective article sharing his personal experience working at OpenAI, providing first-hand insights into its operations.

Who is Calvin French-Owen?

Calvin is an experienced entrepreneur and engineer.

According to his LinkedIn profile, Calvin studied computer science at MIT as an undergraduate.

Before graduating, he co-founded Segment, a customer data platform, serving as CTO. Segment was acquired by Twilio for $3.2 billion in 2020. He also briefly worked at Y Combinator.

In May 2024, Calvin joined OpenAI as an engineer, working on the Codex project, OpenAI’s AI programming assistant designed to boost coding efficiency.

After over a year at OpenAI, Calvin left in June this year.

Three weeks after leaving, he wrote a blog post titled “Reflections on OpenAI.”

Regarding his departure, Calvin emphasized there was no personal grudge. He was somewhat conflicted about leaving, given the transition from being an entrepreneur to joining a 3,000-person company. He expressed a desire to start anew but also hinted he might return, as working on AGI and cutting-edge tech at OpenAI is too attractive to pass up.

No email culture, communication relies on Slack

Calvin revealed that OpenAI’s growth was astonishing.

When he joined, the company had just over 1,000 employees; within a year, it had grown past 3,000. By the time he left, Calvin's tenure was longer than that of roughly 70% of the staff, and everyone in leadership was doing a vastly different job than they had been two or three years earlier.

This rapid expansion created growing pains across the board: internal communication, organizational structure, product releases, personnel management, and recruitment processes all had to keep being reinvented.

Different teams had vastly different working styles: some sprint on new projects, others focus on large models, and some maintain a steady pace. Overall, there was no unified work experience at OpenAI because research, application, and marketing teams operated at different rhythms.

Interestingly, internal communication was almost entirely via Slack, with little use of email. Calvin said he received about ten emails in a year. Poor Slack notification settings could be distracting, but well-managed Slack was quite effective.

Promotion based on ability, not politics or speeches

OpenAI’s culture is very “bottom-up.” When Calvin asked about quarterly plans, the answer was often “there is no such thing” (though now there is).

Good ideas can come from anywhere, and no one knew which would succeed. Progress depended on trial and error, discoveries, and pushing boundaries, not on grand blueprints.

This culture values capability. Leadership promotions are based on ideas and execution, not on speeches or office politics. Many top leaders are not good at office politics, but good ideas and results are what matter most.

Act first, ask later — no need for approval

OpenAI emphasizes “action first.” Teams often come up with similar ideas independently. Calvin participated in an internal project similar to ChatGPT Connectors. Before Codex’s release, several prototypes existed, developed spontaneously without approval, as long as the idea showed potential.

Andrey, Codex’s lead, said researchers are like “small CEOs,” encouraged to identify problems and experiment freely. If a problem is deemed “boring” or “solved,” no one will revisit it.

Excellent research managers are crucial but resources are limited. They connect different research efforts and push larger-scale model training. Good product managers (PMs) are equally important.

Calvin gave an example: the engineering managers he worked with—Akshay, Rizzo, Sulman—are very steady, having seen many projects. They mostly manage independently, focusing on recruiting top talent and creating success conditions.

Rapid strategic adjustments and strict confidentiality

OpenAI’s strategy shifts quickly. New information prompts immediate changes, not sticking rigidly to plans. A 3,000-person company can make decisions efficiently, unlike Google. Once a direction is set, they go all-in.

The company is highly secretive. Many internal developments are leaked or reported by media before official announcements. Some Twitter users even run bots to monitor new features.

Therefore, confidentiality is critical. Calvin couldn’t discuss his work openly. Slack has strict access controls, and financial data is highly confidential.

Internal focus on security

OpenAI takes security seriously due to its high responsibility. The goal is to develop AGI, which carries immense pressure. The company serves hundreds of millions of users, including in healthcare and mental health, and competes globally against Meta, Google, and Anthropic. Governments also closely watch AI developments.

Despite media criticism, everyone is working hard to do the right thing. As a consumer-oriented company, OpenAI attracts attention and criticism alike.

Don’t see OpenAI as monolithic. It’s more like the early Los Alamos Laboratory: a group of scientists pioneering frontier research, accidentally creating globally impactful applications, then expanding to government and enterprise markets. Employees’ goals and perspectives vary widely, and over time, many see it as a “research lab” or “public good.”

OpenAI is truly practicing AI for all. Cutting-edge models are not just for big clients; anyone can use ChatGPT, even without logging in. APIs are open to startups, and the most advanced models will soon be available.

Security is more important than you think. Many are dedicated to developing safety systems, focusing on real risks like hate speech, abuse, political manipulation, biological weapons, self-harm, and prompt injection, rather than speculative risks like AI explosion. Some research on these is ongoing, but much is not public.

Driven by Twitter atmosphere

Unlike other companies that hand out branded souvenirs at recruitment events, OpenAI's merchandise is scarce. New employees receive few items, and the occasional "limited edition" drops have crashed the Shopify storefront under demand. An internal post even explained how to order merch by submitting the JSON payload directly to the backend.

GPU costs dominate expenses. The GPU cost of one niche Codex feature, for example, matched Segment's entire infrastructure bill, even though Segment operated at a far smaller scale than ChatGPT.

OpenAI aims to compete in multiple fields: API, fundamental research, hardware, coding agents, image generation, and more, including undisclosed projects.

OpenAI pays close attention to Twitter. A viral tweet about OpenAI often triggers internal discussions. Some joke that “the company is driven by Twitter atmosphere.”

High team mobility and grounded leadership

OpenAI’s team is highly fluid. When Codex launched, experienced ChatGPT engineers quickly joined to meet deadlines. No quarterly planning delays—actions are swift.

Leaders are deeply involved. Executives like Greg Brockman, Sam Altman, Karpathy, Mark, and Dane participate actively on Slack. No one is a “figurehead.”

Like early Meta

OpenAI uses a monolithic codebase mainly in Python, with increasing Rust and some Go services, often for networking. Python’s flexibility leads to diverse coding styles, from Google veterans’ libraries to quick scripts in Jupyter notebooks. No strict code style enforcement.

All services run on Azure, and only about three of them are dependable: Azure Kubernetes Service, CosmosDB, and BlobStore. There are no direct equivalents to services like DynamoDB, Spanner, or BigQuery. Auto-scaling is limited and permissions are less robust, so the company tends to build in-house solutions.

The talent flow from Meta to OpenAI is obvious. Infrastructure engineers from Meta and Instagram are prominent, and the similarities show in systems such as an in-house reimplementation of TAO (Meta's graph data store) and edge identity projects.

Chat features are deeply integrated into the codebase. After ChatGPT’s success, many core components revolve around chat messages and dialogues. Codex is slightly different, focusing more on API logic, but still borrowing heavily from existing tech.

Code is king. Without a central architecture or standards committee, teams are encouraged to act quickly. This leads to code duplication, such as multiple libraries for queue management or proxy loops.

Rapid expansion and insufficient tooling cause issues. Backend monoliths like sa-server have become "dumping grounds", CI/CD pipelines often break, and test runs can take half an hour. These are common problems in fast-growing companies, and efforts are underway to improve them.

From first Codex line to release in just 7 weeks

Calvin also shared the Codex release story.

In November 2024, OpenAI set a goal of launching a coding agent in 2025. By February 2025, internal tools were already effective, and many "vibe coding" tools were emerging.

Calvin cut his paternity leave short to join the Codex team, and a week later the two teams merged for a sprint. From the first line of code to release took only 7 weeks: working late into the night, waking at 5:30 am for the newborn, and working through weekends.

This speed is extraordinary. Few companies can go from idea to full product so fast. The project involved building container environments, optimizing code repositories, fine-tuning models for code editing, supporting git, developing new interfaces, and internet access—creating a highly usable product.

No matter your view of OpenAI, its “sprint and release” spirit remains strong.

The Codex team includes 8 senior engineers, 4 researchers, 2 designers, 2 marketers, and 1 product manager. No one needs micromanagement; coordination is key.

On the eve of release, five team members worked until 4 am deploying core services. The next morning, they prepared for the launch and livestream. Once live, traffic surged. "I've never seen a product attract so many users just by appearing in the ChatGPT sidebar: this is the power of ChatGPT."

Codex’s deployment is fully asynchronous. Users initiate tasks, and agents run in isolated environments. “Our vision is that, in the future, users will treat coding agents as ‘colleagues’: send a task, and it will submit a PR.”

This is risky. The current models are good but not perfect; they can run for minutes but not hours continuously. User trust varies, and many are still unsure of the true capabilities. Calvin believes programming will increasingly resemble Codex in the future.

Codex excels at handling large codebases and multitasking. Compared with other tools, it can run multiple tasks simultaneously and compare the results. In its first 53 days, Codex generated 630,000 public PRs, about 78,000 per engineer on the team, with even more private contributions.

Departure reflections

Calvin admits he was initially hesitant about joining OpenAI. Giving up startup freedom, accepting management, and becoming part of a large machine was daunting. He kept a low profile at first, unsure if he could adapt.

He sought three things from OpenAI:

  • Understanding model training and future directions;
  • Working with top talent and learning;
  • Launching a great product.

He achieved all three. Additionally, he gained other insights:

  • The power of “large consumer brands”: At OpenAI, all metrics revolve around “Pro subscriptions.” Even tools like Codex are designed primarily for individual use, not team workflows. Once launched, traffic floods in.
  • Training large models: It’s a process from “experiments” to “engineering.” Small-scale tests lead to larger training, involving algorithm tuning and data optimization. Large-scale training resembles managing a massive distributed system with unexpected edge cases.
  • GPU cost management: When releasing Codex, predicting load capacity is key. The focus should be on latency, token count, and initial response time, not just GPU performance. Each iteration changes load patterns significantly.
  • Working with large Python codebases: When many developers maintain a repo, safeguards like defaults, branch hygiene, and misuse prevention are essential, enforced through standards and tools.
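The last point lends itself to tooling. As a minimal sketch of the kind of misuse-prevention check such a repo might run in CI (my illustration, not anything from OpenAI's codebase), here is an AST pass that flags mutable default arguments, a classic hazard when many engineers share one Python codebase:

```python
import ast


def find_mutable_defaults(source: str) -> list[str]:
    """Return 'lineno:function' entries for defs with mutable default args."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            # kw_defaults may contain None for keyword-only args without defaults.
            for default in node.args.defaults + node.args.kw_defaults:
                if isinstance(default, (ast.List, ast.Dict, ast.Set)):
                    findings.append(f"{node.lineno}:{node.name}")
    return findings


sample = """
def ok(x=None): ...
def bad(items=[]): ...
"""
print(find_mutable_defaults(sample))  # -> ['3:bad']
```

Wired into a pre-commit hook or CI step, a check like this turns a code-review convention into an enforced default, which is exactly the "standards and tools" approach the bullet describes.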

Finally, Calvin suggests that if you’re an entrepreneur feeling stuck, reflect deeply on how to make more progress or join top labs. Currently, the AI race is a three-way battle: OpenAI, Anthropic, and Google. Each has a different approach, and working at any one of them is eye-opening.

Reference link: https://calv.info/openai-reflections
