Google to Facilitate Development of Generative AI
Cinthya Alaniz Salazar - Mon, 05/15/2023

Google has recently unveiled a significant expansion to its Vertex AI portfolio by introducing three new foundation models. Through this move, the company aims to provide developers and data scientists with enhanced capabilities to build generative AI applications. 

As the AI race accelerates, Google says its new tools are designed to help developers create “bold and responsible” AI applications, supported by enterprise-grade safety, security and privacy, according to a recent press release.

“With our new foundation models available in Vertex AI and our expanding toolset for customizing and leveraging those models, we are continuing to transform how organizations across all industries and levels of technical expertise build and interact with AI in the cloud,” reads the press release. With these latest developments, Google is reinforcing its commitment to democratizing AI and making it accessible to organizations of all sizes and technical capabilities.

At Google I/O 2023, the company announced that its Vertex AI toolset would expand to include three foundation models, Codey, Imagen and Chirp, along with an Embeddings API for images and Reinforcement Learning from Human Feedback (RLHF). The sections below explore the potential benefits and limitations of each tool and how they fit into the larger landscape of AI development and innovation.



Codey

Codey is a text-to-code foundation model that supports over 20 coding languages and stands to accelerate software development with real-time code completion and generation, customizable to a customer’s own codebase. By streamlining the coding process and reducing the amount of code that must be written by hand, Codey could help organizations build more sophisticated AI tools faster and more efficiently. For generative AI development, this gives organizations a practical way to build more advanced applications.

Codey’s limitations start with its language coverage. With support for just over 20 coding languages, it may not be fully compatible with certain specialized or niche languages such as Rust, Erlang and Lua. Moreover, while Codey may generate code quickly and efficiently, some situations will still demand manual work to fine-tune and optimize its output. Finally, other factors affect the overall performance and efficiency of an organization's generative AI applications, such as the quality and accuracy of the training data used to develop the models.


Imagen

Imagen is a text-to-image foundation model that covers a variety of use cases. With the ability to generate and edit high-quality images at scale, as well as caption and classify images in over 300 languages, Imagen has the potential to change how organizations approach image creation and management. Additionally, its built-in content moderation features are designed to keep generated images safe and appropriate, which makes Imagen useful for companies seeking to protect their brand and reputation.


Chirp

Chirp is a speech-to-text foundation model that brings the power of large models to speech tasks, helping companies engage with customers and constituents more inclusively in their native languages. Trained on millions of hours of audio, Chirp supports over 100 languages and, according to Google, achieves 98% accuracy on English and a relative improvement of up to 300% in languages with fewer than 10 million speakers. By leveraging Chirp's speech-to-text capabilities, organizations can create more accessible and personalized customer experiences that cater to a diverse range of language needs.

Embeddings API

The Embeddings API converts text and image data into multidimensional numerical vectors, so that semantic relationships can be mapped and processed by large models. This is particularly useful for longer inputs, such as texts with thousands of tokens. By facilitating the identification of latent variables through improved clustering, anomaly detection and sentiment analysis, the Embeddings API helps developers enhance user experience, gain insights into their data and get more out of large models.
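To make the idea of "semantic relationships" between vectors concrete, the sketch below compares a few embeddings using cosine similarity, a common measure of how closely two vectors point in the same direction. The three-dimensional vectors and the helper names (`cosine_similarity`, `most_similar`, `toy_embeddings`) are invented for illustration; they are not output from, or part of, the Vertex AI Embeddings API, and real embeddings typically have hundreds of dimensions.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical, hand-written vectors standing in for real embeddings.
toy_embeddings = {
    "red sneakers": [0.9, 0.1, 0.0],
    "crimson running shoes": [0.8, 0.2, 0.1],
    "quarterly tax report": [0.0, 0.1, 0.9],
}


def most_similar(query, embeddings):
    """Return the name of the item whose vector is closest to the query's."""
    query_vec = embeddings[query]
    scored = [
        (cosine_similarity(query_vec, vec), name)
        for name, vec in embeddings.items()
        if name != query
    ]
    return max(scored)[1]


print(most_similar("red sneakers", toy_embeddings))
```

In this toy data the two footwear descriptions sit close together while the tax report sits far away, which is the property that clustering and anomaly detection build on.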

Reinforcement Learning from Human Feedback (RLHF)

Billed as a first of its kind among hyperscalers, RLHF is a tuning feature that allows organizations to incorporate human feedback to train a reward model, which is then used to fine-tune foundation models. It also enables human reviewers to evaluate model responses for bias, toxicity or other issues, thus teaching the model to avoid inappropriate outputs. This feature is particularly beneficial for industries that require a high level of accuracy, such as healthcare, or where customer satisfaction is critical, such as finance and e-commerce. As a managed service offering, RLHF has the potential to cost-efficiently maintain model performance over time and help organizations deploy safer, more accurate and more useful production models.
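As a rough illustration of how human feedback can train a reward model, the sketch below fits a tiny linear scorer to pairs of responses in which a reviewer preferred one over the other, using a standard pairwise logistic (Bradley-Terry style) objective. It is not Vertex AI's implementation: the two features (a helpfulness signal and a toxicity signal), the data and the function names are all hypothetical, and the later step of fine-tuning a foundation model against the reward is omitted.

```python
import math


def reward(weights, features):
    """Score a response as a weighted sum of its features."""
    return sum(w * x for w, x in zip(weights, features))


def train_reward_model(pairs, steps=200, lr=0.5):
    """Fit weights so preferred responses score higher than rejected ones.

    Each pair is (preferred_features, rejected_features). The update is a
    gradient step on -log(sigmoid(reward(preferred) - reward(rejected))).
    """
    weights = [0.0] * len(pairs[0][0])
    for _ in range(steps):
        for preferred, rejected in pairs:
            margin = reward(weights, preferred) - reward(weights, rejected)
            p_correct = 1.0 / (1.0 + math.exp(-margin))
            scale = lr * (1.0 - p_correct)
            weights = [
                w + scale * (p - r)
                for w, p, r in zip(weights, preferred, rejected)
            ]
    return weights


# Hypothetical reviewer judgments: [helpfulness_signal, toxicity_signal].
pairs = [
    ([0.9, 0.0], [0.4, 0.8]),
    ([0.7, 0.1], [0.6, 0.9]),
    ([0.8, 0.0], [0.2, 0.1]),
]

weights = train_reward_model(pairs)
print(weights)
```

With this made-up data the learned weight on the helpfulness signal comes out positive and the weight on the toxicity signal comes out negative, so the scorer ranks responses the way the reviewers did.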


The new generative AI tools developed by Google are poised to have a significant impact on the field. They offer researchers and developers a level of sophistication and versatility that has so far been hard to access. By allowing users to generate high-quality content across a variety of domains, from code to images to speech transcription, these tools have the potential to democratize access to AI technology. This could lead to a proliferation of new applications and use cases as more individuals and organizations incorporate AI into their workflows. However, the impact of these tools on the broader AI ecosystem remains to be seen.


