AI is quietly surpassing me in yet another aspect—this time, it has better pitch than I do!
Upon hearing this new song, I initially thought it was by a talented “Little Dao Lang”… It’s hard to explain in just a few sentences; let’s just listen:

Audio link: https://mp.weixin.qq.com/s/yDWVdeQuGxPgXzNLcxFLvg
The story goes like this:
A young man who has just finished his exams and graduated expresses his reluctance to part with his teachers and classmates. It captures the unique innocence and greenness of youth, while also harboring full anticipation for the future.
Impressive production, right? The dynamic rhythm, smooth melody, and emotional ups and downs all maintain a professional standard.
But would you believe it? From lyrics to arrangement, the entire song was generated with one click by AI.
Those “Little Dao Langs” simply expressed their thoughts in a single sentence, waited less than a minute, and produced a complete 2–6 minute track with stable structure, consistent pitch, and natural vocal timbre that doesn’t drift.
All of this comes from FreeQuantum, an AI company focused on self-developed music large models, which recently released its new model—YinChao V3.0.
Compared to the previous generation, YinChao V3.0 has achieved significant improvements in vocal quality, overall listenability and memorability, arrangement richness, and musical completeness.

Currently, YinChao V3.0 is officially available on the web platform and official App, open for free trial to all users.
In that case, it’s time for us at this website to put our skills to the test. Let’s dive in~
AI “Soul Singer” Writes Songs For You
Open the App, and you’ll see four creation modes: One-Sentence Songwriting, Photo-to-Song, Lyrics-to-Song, and Hit Song Remix.
Additionally, users can create custom voices, using their own voice to generate tracks. This means that even if you’re tone-deaf and have no songwriting skills, AI can help you debut on the spot.

Let’s start with “One-Sentence Songwriting.” It’s simple and straightforward: input a sentence describing the song style or content you want.
For example, we entered a prompt about wishing for good luck in the new year and driving away bad luck:
Go away! Bad luck, scatter! The exclusive battle anthem to banish misfortune.
If you’re unsure how to express yourself, the system offers “One-Click AI Polish” and “Inspiration Prompts” features, further lowering the barrier to entry.

After entering your prompt, you can flexibly choose between two creation modes based on specific use cases.
- “Snippet Mode” is designed for short-form content scenarios like short videos and social media sharing, directly generating compact segments with prominent climaxes.
- “Full Version Mode” generates mature works lasting 2–6 minutes, covering complete structures including verses, choruses, and interludes, making it more suitable for personal projects or deep expression.
The system automatically matches recommended voices based on the song style. If you’ve previously created a personalized voice profile, you can select it here to give your work greater individuality.

Once all settings are ready, click the “Generate Song” button.
In less than a minute, an entirely new song belonging to you is created. Let’s listen:

Audio link: https://mp.weixin.qq.com/s/yDWVdeQuGxPgXzNLcxFLvg
The prompt was understood well, the melody is clear, and the rhythm hits hard—it’s quite catchy (I’ve already started looping it).
The lyrics consistently revolve around the core theme of “bad luck goes away, good luck arrives,” with frequent golden lines:
“You shout once, I light a lamp; together our harmony overturns the night. We don’t wait for the wind to come; we generate our own.” “Pack up old worries and mail them to the Arctic Circle.” “Today, only accept deliveries, not bad news…”
This little AI is quite internet-savvy and knows how to write well.

English songs are also supported, with equally impressive results:

Audio link: https://mp.weixin.qq.com/s/yDWVdeQuGxPgXzNLcxFLvg
Of course, if you’re already skilled at writing lyrics or have existing ones, you can directly use the “Lyrics-to-Song” mode.
In this mode, you simply copy and paste your lyrics into the input box and make basic paragraph divisions. It supports various common structural sections such as verses, choruses, interludes, and bridges, and you can also use the built-in “Lyric Optimization” feature to polish them with one click.
Styles are set separately below the input box. The official platform provides multiple preset styles and also supports customization. Genres, emotions, instruments, languages (Chinese/English), and vocal gender can all be freely selected.

For example, simply input a short, romantic lyric snippet to have the AI optimize and polish it with one click, then compose music based on those lyrics. The final product is ready:

Video link: https://mp.weixin.qq.com/s/yDWVdeQuGxPgXzNLcxFLvg
As the vinyl spins softly, creating a lazy and intoxicating atmosphere, the vibe is instantly amplified.
The next feature, “Photo-to-Song,” is even simpler. You only need to upload a photo; there’s no need to write prompts or set styles. The model can interpret the image content and automatically generate matching lyrics and melody:

For instance, let’s select a random reference image to generate a short clip (using snippet mode). Let’s hear how it sounds:

Video link: https://mp.weixin.qq.com/s/yDWVdeQuGxPgXzNLcxFLvg
It can handle various musical styles with ease.
Here is another shot taken from inside a car on a road trip. Use this as the background music for your next social media post during travel:

Video link: https://mp.weixin.qq.com/s/yDWVdeQuGxPgXzNLcxFLvg
The fourth feature, “Hit Song Remix,” involves creating adaptations based on existing works. We won’t go into detail here, but interested users can try it out themselves.
Notably, all songs generated by Yinchao can be directly downloaded as audio or video files. The videos automatically come with AI-generated cover art and allow for editable song titles, making sharing extremely convenient.

After testing it thoroughly, Yinchao indeed makes music creation much easier than one might imagine.
People without knowledge of music theory, instrumental skills, or arrangement experience can simply write down their stories or emotions to generate a complete song with clear expression. Those daily fragments that once lingered only in the mind now have the possibility of being carried by music.
More importantly, it is not just easy to use; the quality of the final product withstands repeated listening.
The melodic progression feels natural, and the chorus has memorable hooks. The arrangement structure is coherent, lacking any obvious patchwork feel. Vocal details are handled with restraint, avoiding stiffness or an overtly mechanical AI sound. The overall listening experience exceeds that of a mere trial; these are polished works worth sharing.
So, how does Yinchao achieve this?
Behind the Scenes: The Synergy of Music and Technology
Venturing into the deep waters of AI music, pure algorithmic iteration often hits “the ceiling of aesthetics.”
Many models lack “musicality” because algorithm development struggles to translate abstract music theory into concrete optimization goals, creating a natural cognitive gap between technology and art.
At Yinchao, this barrier has been completely broken.
Its members are passionate about contributing their musical insights. From complex music theory logic to nuanced arrangement aesthetics, they actively participate in every technical discussion, precisely “translating” intuitive musical feelings into rational algorithmic language.
This cross-disciplinary integration effectively compensates for the lack of understanding of music itself from a purely technological perspective, injecting professional musical knowledge directly into the bloodline of algorithmic iteration.
It is this “Music + Technology” dual-helix drive that ensures Yinchao V3.0’s generation is no longer unidirectional reasoning but a creative act built upon theoretical musical cognition.
So, in what specific aspects has Yinchao V3.0 upgraded?
First and foremost, the most intuitive change is a significant improvement in vocal quality.
By introducing the team’s self-developed dual-track modeling mechanism[1], Yinchao V3.0 separates the modeling of vocals and accompaniment. It learns features in different semantic spaces before fusing them at a higher structural level.
This approach avoids information interference between vocals and accompaniment while precisely matching their synergistic relationship in rhythm and harmony.

Building on this, the team introduced layered enhancement strategies and hybrid training objectives from its self-developed HEAR framework[2]. This ensures accurate replication of vocal techniques such as melisma and glissando while strengthening the model’s ability to perceive song emotions through hierarchical learning of musical aesthetics. The model learns expression logic across different aesthetic dimensions, rather than merely satisfying itself with “singing in tune.”
The resulting experience shows clear differentiated advantages: it no longer just sings out lyrics but adjusts vocal tone based on semantics and context—sadness is not just slow; it is restrained emotion. Passion is not just high-pitched; it is driven tension.
Vocals begin to possess narrative capability.

Secondly, changes at the melodic level are equally noticeable.
A common problem in current AI music is that while it sounds smooth, it lacks memorable hooks.
Yinchao V3.0’s melody generation mechanism significantly enhances motif design capabilities. The distribution of tension between notes demonstrates greater structural awareness, and the relationship between climaxes and build-ups becomes clearer.
clear, making the chorus easier to form a recognizable hook.
In other words, it has begun to possess the ability to “write choruses.” Melodies are no longer just linear flows; they intentionally build highlight moments, creating anchors for both emotion and auditory experience in the work.
Once the melody and vocals are firmly established, the overall coherence and diversity of the arrangement also improve.
YinChao V3.0 has matured in style modeling, automatically matching more appropriate instrumentation strategies based on different music genres. Instruments no longer simply stack upon one another; instead, they divide labor around the main melody. The transitions between sections are more natural, bridge connections are smoother, and rhythmic layers are clearer.
At the same time, the “physical texture” of sound has been re-polished. YinChao V3.0 employs its team-developed ϵar-VAE[3] core technology to independently model spatial information, applying this high-fidelity reconstruction solution throughout the entire generation pipeline.
ϵar-VAE introduces representation and supervision methods for spatial information, accurately restoring design details involving temporal-spatial shifts found in high-quality music—such as Tom fills in drum sections of an arrangement or automated panning/reverb movements of instruments in mixing.
The impact of drums, the graininess of electric guitars, and the spatial layers of reverb are all clearer than before. The listening experience is no longer just a simple high-fidelity frequency response but truly restores the complex layering arrangements and spatial designs within the music.

These improvements, when combined, bring changes that are not just breakthroughs in single areas but an overall upgrade in listening experience.
However, technical challenges do not end with generation.
Music evaluation is inherently a highly subjective field, lacking absolute objective automated metrics.
To address this, FreeScale established a professional evaluation team and built a fine-grained review system.
The review dimensions are extremely detailed, covering melodic motifs, vocal performance (particularly the handling of tones and emotions unique to Chinese), arrangement richness, instrument sound quality restoration, and overall stylistic unity.
They also constructed a large-scale reinforcement learning annotation database, mapping human aesthetics into the model’s parameter space to achieve “human-machine aesthetic alignment.”
Dr. Jiang Tao, CTO and Executive CEO of FreeScale, stated that aesthetic alignment is a core challenge: “How to converge the tastes of annotators from different backgrounds into a universal, credible consensus on aesthetics, and use data to help models truly understand this beauty?” They iterated through countless versions in this process, with the ultimate goal of making AI’s creative judgment infinitely approach the industry intuition of senior musicians.
A series of achievements by the team has now received positive validation from authoritative international academic platforms.
At ICASSP 2026, an international top conference in acoustics and audio, the results of the inaugural “Automatic Song Aesthetics Evaluation Challenge” were announced. FreeScale’s AI music evaluation system (BAL-RAE) stood out among fierce competition from multiple global research teams, securing second place globally in Task 1 (Comprehensive Song Aesthetic Scoring).

From the pioneering days with no models available to now reaching industry-leading standards in key dimensions such as “human-like feel,” musicality, and arrangement richness, FreeScale’s true technological moat comes from its long-term and steadfast full-chain investment in underlying model architecture, data, and aesthetic alignment.
It is worth noting that this investment has not been closed off.
The music industry itself is a relatively closed-source ecosystem, with commercial companies’ technical solutions mostly hidden behind walls. FreeScale could have kept its self-developed system under wraps to advance quietly, but they chose to open-source some of their research results and modules externally.
On the product side, enabling everyone to write songs; on the technical side, paving the way for more teams. For a company that already has commercial solutions to still be willing to contribute its technical details and components to the open-source community is itself a rare feat.
More open-source achievements can be found on FreeScale’s technical team ear-lab homepage: https://eps-acoustic-revolution-lab.github.io/ear-lab

Technology reaching this level is often not accidental. Looking back along the model and product lines, the team behind it is actually worth discussing in more detail.
A group of people who understand music, aiming to let everyone express themselves through music
During conversations with the team, a very direct feeling emerged: they approach music AI not starting from how powerful the model capabilities are, but from the act of music creation itself.
FreeScale was founded in 2023, focusing on AIGC and multimodal large model R&D. The core team is highly distinctive: every member is a musician.
CTO and Executive CEO Jiang Tao jokingly said, “Our algorithm team could form a band; we have enough people for wind, string, percussion, and vocals.” Guitars and Populeles are grabbed casually from desks, allowing developers to jam out a few bars during coding breaks.
The head of the professional evaluation team, although trained in engineering, is also a musician who has written lyrics and composed music for top-tier artists. This role serves as a two-way translator—understanding both the emotional tension and stylistic expression within musical language, as well as the metric logic and optimization paths within algorithmic systems, bridging the cognitive gap between these two fields.
Interestingly, this collaboration often produces interesting collisions. Jiang Tao revealed that sometimes, from a musician’s perspective, a generated piece feels highly infectious, while the algorithm side sees it failing to meet standards based on spectrograms or structural metrics; conversely, certain “fuzziness” in recorded instruments might be counted as defects by technical indicators but sounds more realistic to the ear.
It is precisely this continuous tug-of-war that allows the product to find a balance between technological control and emotional surprise.
…achieved a dynamic balance.

Growing upward, they refine the human touch and texture of their models; rooting downward, they deploy these capabilities to places closest to ordinary people. For the “free-tier” model, these two objectives are never disconnected—the higher technology ascends, the more firmly it must ground itself in practical application.
Currently, YinChao has entered the supply chain for music generation interface services of multiple manufacturers, covering areas such as music creation tools, MV generation, and image-to-video conversion. Offline collaborations with KTVs are also underway, meaning users may soon be able to sing their own AI-original songs in private booths. Even the official theme song for the 2025 World Artificial Intelligence Conference (WAIC), AI For Good, was fully supported by the YinChao large model from lyric writing and composition to vocal performance.
“Music consumption is layered, scenario-specific, and audience-targeted,” the team stated. “Our service sweet spot lies precisely where it is closest to everyone.”
At a philosophical level, they emphasize enabling everyone to create music; at a mechanistic level, YinChao’s user agreement explicitly assigns copyright of AI-generated music to users, providing assistance with copyright certification for creators. From professional stages to KTVs, from film scores to background music on social media feeds, music is undergoing a transformation in its tool-like form.
Dr. Jiang Tao expressed confidence and resolve: “Ride-hailing drivers and delivery riders have stories and ideas; they lack tools. They could very well be the ‘Jay Chous’ of this era.”