Born in 95, Innovating Business While Publishing Top AI Papers: An Unprecedented Experience

Young AI talents born after 1995 are simultaneously transforming industries and publishing top conference papers, showcasing rapid growth and innovative spirit in the AI field.

In the AI era, top talents wield unprecedented influence, and their market impact keeps raising their status. From the authors of Google’s Transformer paper to scientists who have left OpenAI, they either found their own ventures backed by billion-dollar investments or join other companies, narrowing technological gaps and reshaping the competitive landscape.

The supply of such talent, although growing quickly, cannot keep pace with explosive demand from internet giants and startups, which gives these individuals strong bargaining power. Companies are pulling out all the stops to attract innovators who can lead the industry or break through key bottlenecks.

Within China, too, this talent arms race is intense, systematic, and global in scope. Major internet companies are launching aggressive talent programs: JD’s TGT Top Youth Talent Plan, ByteDance’s Top Seed Talent Program, Tencent’s Qingyun Plan, Baidu’s Wenxin・Xinxing Plan, and more. These initiatives offer industry-leading salaries, sometimes with no cap, to bring top talent on board.

Achieving a win-win situation for both companies and talents requires “mutual effort.”

Recently, an in-person tech salon brought together industry leaders and top university technologists for in-depth discussions on cutting-edge technology directions and shared aspirations for talent development.

[Image] JD Tech Salon Retail Session

This was the final event of JD’s 2023 global “JD Tech Salon,” where several large-model teams within JD Retail presented top conference papers and real-world case studies, and showcased their latest technological advances and innovative applications.

Senior technical experts from core departments also shared their experiences and insights, helping newcomers quickly understand JD’s rich business scenarios and how to find the most suitable teams and roles.

How can newcomers quickly grow into core technical contributors as they move from academic research to industry practice? Driven by curiosity, we spoke with five young technical experts from JD Retail R&D, born between 1992 and 1998. Their experiences may offer a valuable reference for those about to enter the workforce.

One-Year Newcomer Journey: Overcoming Challenges and Embracing Hard Problems

Luochuan, a 27-year-old male.

In 2024, after earning a PhD in Computer Software and Theory from UCAS, he joined JD’s Retail AI Infrastructure team.

Like many newcomers, Luochuan was initially anxious transitioning from campus to the workplace.

But a comprehensive support system completely dispelled his worries.

Luochuan has two mentors—one in business, one in technology—who meet with him monthly to discuss personal growth and technical issues. Soon, he began systematically familiarizing himself with the team’s tech stack, codebase, and work rhythm.

[Image] Luochuan (third from left) and colleagues at a team-building event

After a few months, Luochuan, now past the initial stage, eagerly wanted to apply his research to real-world problems. “Most academic research stays on paper, but JD’s rich business scenarios and massive industry data finally gave my research a chance to scale up,” he said.

Having familiarized himself with the business, Luochuan began to identify pain points in his technical field. His team is responsible for building and optimizing infrastructure supporting large-scale AI applications, including cluster management, compute scheduling, data and sample centers, and inference engine optimization.

Through long-term observation of JD’s e-commerce platform and guidance from mentors, Luochuan set clear goals. He found that since the advent of large models, recommendation systems have benefited from Scaling Laws. However, as sparse parameters in recommendation models grow and user behavior sequences reach tens of thousands or even hundreds of thousands, storage, communication, and query costs become bottlenecks, affecting iteration efficiency.

Luochuan was eager to tackle this challenge. After understanding core business needs and technical difficulties, he analyzed existing solutions and devised a feasible plan tailored to the business scenario.

He and his team designed a perceptual quantization and caching scheme that significantly reduces storage, communication, and query overhead for sparse parameters, greatly accelerating distributed training of CTR models. Seeing the results in practice, Luochuan felt his efforts paid off: “My hard work was worth it.”
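The article does not describe the team's scheme in detail, so the following is only a generic sketch of the two ideas it names: quantizing sparse embedding rows and caching hot rows. Uniform 8-bit quantization, the LRU eviction policy, and every name here (`quantize_row`, `QuantizedEmbeddingCache`, `fetch_fn`) are illustrative assumptions, not JD's implementation.

```python
from collections import OrderedDict


def quantize_row(row, bits=8):
    """Uniformly quantize a list of floats to signed integer codes plus one scale."""
    qmax = 2 ** (bits - 1) - 1  # 127 for 8 bits
    max_abs = max((abs(x) for x in row), default=0.0)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    codes = [round(x / scale) for x in row]
    return codes, scale


def dequantize_row(codes, scale):
    """Recover approximate float values from integer codes."""
    return [c * scale for c in codes]


class QuantizedEmbeddingCache:
    """LRU cache that stores hot embedding rows in quantized form.

    fetch_fn is a hypothetical stand-in for an expensive remote lookup
    (for example, a parameter-server query).
    """

    def __init__(self, capacity, fetch_fn):
        self.capacity = capacity
        self.fetch_fn = fetch_fn
        self.rows = OrderedDict()
        self.hits = 0
        self.misses = 0

    def get(self, key):
        if key in self.rows:
            # Cache hit: mark the row as most recently used.
            self.hits += 1
            self.rows.move_to_end(key)
            codes, scale = self.rows[key]
        else:
            # Cache miss: fetch, quantize, store, and evict the oldest row if full.
            self.misses += 1
            codes, scale = quantize_row(self.fetch_fn(key))
            self.rows[key] = (codes, scale)
            if len(self.rows) > self.capacity:
                self.rows.popitem(last=False)
        return dequantize_row(codes, scale)
```

In a sketch like this, each cached row costs one small integer per dimension plus a single scale, and repeated lookups of popular IDs skip the remote query entirely. The trade-off is a rounding error of at most half a quantization step per value.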

This is just a snapshot of Luochuan’s year. Now, he has found his role and is tackling one technical challenge after another with the AI Infra team. “As a newcomer, you must overcome fear of difficulty, dive deep into a field, and be brave to tackle tough problems,” he said.

Three-Year Progress: Focusing on Real Pain Points, From Passive Problem-Solving to Proactive Questioning

Luochuan’s “first-year” experience resonates with his seniors, Qian Yi and Tian Ye, who joined JD three years ago after completing their PhDs at the Institute of Automation, CAS.

Qian Yi focuses on image generation, multimodal large models, and OCR, while Tian Ye works on search relevance and NLP applications in search scenarios. Both joined JD in the advertising and search technology departments.

When they first joined, they faced different challenges.

Tian Ye said his biggest challenge was the transition from academic research to industry engineering, because the two environments differ fundamentally in how problems are defined and how data is organized.

Academic research typically targets well-defined tasks with fixed datasets and metrics, focusing on improving indicators on specific data. In contrast, in the fast-changing e-commerce search industry, core problems evolve rapidly, and engineers must build data loops independently without ready-made datasets.

This transition was tough. “I had to force myself to shift from a pure problem solver to a problem setter and architect, capable of continuous business insight and dynamic data and evaluation system construction,” Tian Ye shared.

He took time to adapt but then thrived, leveraging his expertise to upgrade search experience. His biggest concern was GPU capacity, but JD’s flexible resource allocation strategy supported long-term projects. “Now, with computing resources secured, I can fully focus on generative search technology,” he said.

Meanwhile, Qian Yi initially thought her solid technical foundation was enough, but she soon realized that business needs and tech iteration speed in industry far outpaced academia.

She highlighted her work on a trusted image generation system for e-commerce ads—using RLHF to incorporate human feedback, reducing issues like product deformation and background misalignment, thus improving image quality.

[Image] RFFT achieved SOTA compared to other methods [1]

Qian Yi gradually broadened her technical horizon, mastering more knowledge and skills to meet diverse business challenges. In just three years, she published over ten innovative papers, received multiple top AI conference and journal acceptances, and is now exploring how generative AI can empower advertising creativity, especially in multimodal model automation.

Both Tian Ye and Qian Yi have grown rapidly by addressing real user pain points, gaining valuable experience in solving business challenges.

Post-95s: Boldly Innovating and Exploring Multiple Possibilities

Within JD’s retail tech team, many young engineers like Zhang Lin and Dao Yu, born after 1995, frequently exchange ideas and collaborate with the three colleagues mentioned above.

Zhang Lin focuses on model distillation and data selection, emphasizing training and scaling under resource constraints. He notes that modern deep learning and large models rely heavily on massive data, parameters, and high computing costs, making low-resource training very challenging. “Academic training simplifies problems into clear mathematical questions, but today’s issues involve intertwined technology, business, resources, and personnel,” he said.

He actively seeks advice from mentors and seniors, embracing the freedom and patience they offer. “Don’t be afraid—there are many ways to achieve goals. Be bold to think, act, and explore different possibilities. We will support you,” he often hears.

He proposed training on a carefully selected subset of the most informative samples to balance performance and efficiency. Experiments showed that sampling only 70-80% of data maintained accuracy comparable to the full dataset, outperforming other selection methods.

[Image] Breakthrough in Power Law via Data Selection [2]
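The idea of training on a subset of the most informative samples can be sketched roughly as follows. The article does not say how informativeness is measured, so this example assumes each sample already has a numeric score (per-sample loss is one common proxy). The function name and the default `keep_fraction` of 0.75 are hypothetical, chosen only to fall in the 70-80% range mentioned above.

```python
def select_informative(samples, scores, keep_fraction=0.75):
    """Keep the highest-scoring fraction of samples, preserving original order.

    scores[i] is an assumed informativeness score for samples[i];
    a higher score means the sample is considered more informative.
    """
    if len(samples) != len(scores):
        raise ValueError("samples and scores must have the same length")
    if not 0 < keep_fraction <= 1:
        raise ValueError("keep_fraction must be in (0, 1]")
    if not samples:
        return []
    k = max(1, round(len(samples) * keep_fraction))
    # Rank indices by score, highest first, and keep the top k.
    ranked = sorted(range(len(samples)), key=lambda i: scores[i], reverse=True)
    keep = sorted(ranked[:k])
    return [samples[i] for i in keep]


# Usage: keep 3 of 4 samples, dropping the lowest-scoring one.
subset = select_informative(["a", "b", "c", "d"], [0.1, 0.9, 0.5, 0.7])
```

A static top-k filter like this is the simplest variant. Methods that re-score samples as training progresses are a separate design and are not shown here.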

Recently, three of Zhang Lin’s papers were accepted at the top conferences ICLR, AAAI, and ACL, and he has filed eight patents. One notable work accelerates model training through dynamic data selection.

Similarly, Dao Yu, also born after 1995, focuses on productizing large language models. In e-commerce, she develops models for generating marketing copy, helping users shop and providing professional product recommendations. She emphasizes the importance of emotional value in language, proposing to incorporate personalized language styles to enhance user engagement. Her ideas were quickly adopted by her team, boosting morale: “Here, communication is barrier-free, and industry leaders participate directly, which greatly improves user experience.”

These young talents, driven by free thinking and solid engineering practice, are actively solving real business pain points, rapidly growing and making significant contributions.

Post-95s: Dare to Think, Dare to Act, Explore Possibilities

Young engineers like Zhang Lin and Dao Yu are exploring diverse paths, constantly pushing boundaries and innovating.

Zhang Lin’s research on model distillation and data selection tackles the challenge of training large models with limited resources. He advocates selecting the most informative samples, achieving comparable accuracy with only 70-80% of the data and demonstrating efficient use of resources.

Dao Yu’s work on multimodal generative models aims to automate creative processes in advertising, emphasizing emotional and personalized language styles to better connect with users. Her team’s efforts have been recognized and supported by leadership, fostering a culture of innovation.

In just three years, these young talents have published over ten papers, obtained multiple patents, and are actively exploring cutting-edge AI capabilities to empower business growth.

JD’s talent development initiatives, including the “Doctor Training Program” launched in 2017 and the recent “JD TGT Top Youth Talent Plan,” aim to nurture more young talents with no salary caps, covering fields like multimodal models, machine learning, search, spatial intelligence, high-performance computing, big data, AI infrastructure, and security.

JD hopes to create a nurturing environment where young talents can grow rapidly, contribute to industry innovation, and help build a unique technological moat that sustains competitive advantage in AI, big data, and cloud computing.

Finally, click here to apply for the JD TGT Top Youth Talent Plan and create a better future with technology!

References:
