In the previous article I told you about some problems that can be solved using Artificial Intelligence by consuming it as a service, with no need to train models or worry about the algorithms or the architecture of the neural networks involved.
In this article I will tell you what we are doing to support these services in GeneXus, so that they can be easily used in applications.
Warning: some of the contents of this post are still in the definition stage, so they may change by the time the functionality is released in GeneXus.
GeneXus’ mission is to help people develop the best applications in the simplest way possible.
This mission has two aspects which are relevant in the context of this note.
First of all, we want developers to be able to build the best applications possible. Today, and in the years to come, the best applications will undoubtedly incorporate artificial intelligence components. That’s why I think it’s important for GeneXus to integrate this capability.
On the other hand, we want this development to be as simple as possible. While Cloud providers –as we said in the previous article– provide artificial intelligence services that are easy to use, each one has its particularities.
What we are doing in GeneXus is defining a common API that can be used to develop applications regardless of which provider is eventually used. This approach is consistent with GeneXus’ overall philosophy: the developer works in the same way regardless of the generated language or platform (C#, Java, or .NET Core), the database (SQL Server, Oracle, MySQL, PostgreSQL, etc.), or the smart device platform the application will run on (Android or iOS).
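To make the idea of a provider-agnostic API concrete, here is a minimal sketch (in Python, purely for illustration — none of these class or method names come from GeneXus) of the adapter pattern this describes: application code targets one common interface, and each provider plugs in behind it.

```python
from abc import ABC, abstractmethod

class TranslationProvider(ABC):
    """Common interface the application codes against (hypothetical)."""

    @abstractmethod
    def translate(self, text: str, target_language: str) -> str:
        ...

class FakeAzureTranslator(TranslationProvider):
    # Stand-in for a Microsoft Cognitive Services adapter (no real API calls).
    def translate(self, text: str, target_language: str) -> str:
        return f"[azure:{target_language}] {text}"

class FakeWatsonTranslator(TranslationProvider):
    # Stand-in for an IBM Watson adapter (no real API calls).
    def translate(self, text: str, target_language: str) -> str:
        return f"[watson:{target_language}] {text}"

def translate_document(provider: TranslationProvider, text: str, lang: str) -> str:
    # Application code depends only on the common interface,
    # so swapping providers requires no changes here.
    return provider.translate(text, lang)
```

Swapping `FakeAzureTranslator` for `FakeWatsonTranslator` changes nothing in `translate_document` — which is the whole point of the common API.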
The API provided by GeneXus will comprise several services, and functionalities will continue to be added as they become available.
Note: This common API is in the design stage at the moment, so not all details are available.
Its functionalities can be grouped into three categories: text, image and audio.
The text features we plan to include are as follows:
- Language detection: given a text, it determines the language it is written in, along with a confidence indicator for the result.
- Sentiment analysis: given a text, it determines whether its tone is positive, negative, or neutral.
- Automatic translation: given a text and a target language, it returns the translated text.
- Entity extraction: given a text, it extracts the relevant entities from it, such as names, countries, categories, etc.
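The shape of these text services — text in, structured result out — can be illustrated with a toy language detector. The real cloud services use statistical models; this naive stopword counter (hypothetical code, not the GeneXus API) only mimics the contract of returning a language plus a confidence indicator.

```python
# Tiny stopword lists per language code (illustrative only).
STOPWORDS = {
    "en": {"the", "and", "is", "of", "to", "in"},
    "es": {"el", "la", "y", "es", "de", "en"},
    "pt": {"o", "a", "e", "de", "em", "que"},
}

def detect_language(text: str) -> tuple[str, float]:
    """Return (language_code, confidence) for the given text."""
    words = text.lower().split()
    # Score each language by how many of its stopwords appear in the text.
    scores = {lang: sum(w in sw for w in words) for lang, sw in STOPWORDS.items()}
    best = max(scores, key=scores.get)
    total = sum(scores.values()) or 1
    return best, scores[best] / total  # confidence: share of matches won by `best`
```

A real service would of course cover many more languages and return a calibrated probability, but the input/output contract is the same.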
As for images:
- Scene recognition: given an image, it determines what type of scene it shows (city, countryside, beach, etc.).
- People recognition: this may include detection of faces, facial gestures (smile, anger, etc.), or tags describing the people.
- Emotion recognition: given an image, it recognizes how many faces there are and their emotions.
- Object recognition: given an image, it determines which objects appear in it (with their tags and a percentage of confidence) and the position of each one.
- OCR: given an image containing text, it extracts that text.
- Image classification: given an image, it determines what the image is about.
- Similarity scoring: given two images, it returns a percentage of similarity between them.
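To make the similarity-scoring contract concrete, here is a toy sketch: two images in, a similarity percentage out. Real services compare learned visual features; in this hypothetical example (not the GeneXus API) an "image" is just a flat list of 0–255 grayscale pixels and the score is based on mean absolute pixel difference.

```python
def similarity_score(img_a: list[int], img_b: list[int]) -> float:
    """Return a 0-100 similarity percentage between two grayscale images."""
    if len(img_a) != len(img_b):
        raise ValueError("images must have the same dimensions")
    # Mean absolute difference between corresponding pixels (0 = identical).
    diff = sum(abs(a - b) for a, b in zip(img_a, img_b)) / len(img_a)
    # Map the 0-255 difference range onto a 100-0 similarity percentage.
    return round(100.0 * (1 - diff / 255), 1)
```

Identical images score 100.0; a pure-black image against a pure-white one scores 0.0.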
Lastly, audio features are as follows:
- Text to speech
- Speech to text
Initially, we’re working with three providers: Microsoft Cognitive Services, IBM Watson and SAP Leonardo.
Although these are the first ones we will have, others will be added, such as Amazon Web Services and Google Cloud.
We’re also planning to support providers that run locally on smart devices, such as TensorFlow Lite (Android and iOS), CoreML (iOS) and ML Kit (Android and iOS). For the moment, though, we are focusing on cloud providers.
We at GeneXus believe in simplifying the development of applications as much as possible, and to this end we are working on this new artificial intelligence API.
Stay tuned for new feature announcements in upcoming GeneXus upgrades.