Alexa meetup 2019 in Haarlem Tech, the afterword

Tags: development, techlabs, techtorials
updated: Jan 24, 2020 at 1:25PM | published: Apr 29, 2019

Oh dear, oh dear… At 4BIS, we love technology, we live technology, so much so that we totally forgot about the other thing that unites crowds and connects humans: football. On Tuesday April 16, while the Ajax football club were offering an unexpected victory to their many fans, we and the Haarlem Tech community were having another fun evening meet-up about Artificial Intelligence, wondering where everyone else was at.

Not to worry, a few people still showed up, and there was leftover pizza which, let’s face it, is worth missing football for any day. Anyway. This isn’t an article to declare my undying love for pizza, I’ve already written too many of those in my young days. This is for those of you who couldn’t make it to the meet-up, to summarise what was said, what we learnt, and how we somehow ended up discussing the eventual future robot uprising with pints of delicious Dutch beer in our hands…

Complete science fiction just a couple of decades ago, speech recognition is today one of the most remarkable advancements in the field of Artificial Intelligence. Thanks to Voice Command Devices, humans are now able to have live conversations with virtual assistants on smartphones and smart speakers, without even pressing a single button. The ultimate goal in the quest to humanise technology is to teach computers to mimic human interactions in the most natural way possible, with emotions and the whole shebang. Can you imagine if your home assistant device could order you to do the dishes or complain about your smelly feet, just as realistically as a real human girlfriend would? Well, this might be happening sooner than we think…

At the meet-up, we focused on one particular virtual assistant, a pioneer and very successful on the market: Alexa, developed by Amazon. Alexa is a superstar as far as Voice User Interfaces (VUI) go; it is capable of voice interaction, music playback, making lists, setting alarms and scheduled reminders, streaming podcasts, playing audiobooks, and providing real-time information such as news, traffic, sports or weather… it can even control several smart devices inside your house and be used as a home automation system, managing lighting, home security, temperature, etc. Although Alexa is currently “only” programmed to understand English, German, French, Italian, Spanish and Japanese, Amazon announced in January 2019 that over 100 million Alexa-enabled devices had already been sold.

And it’s not going to stop there. The possibilities are endless. Alexa is so versatile that users can extend its capabilities themselves by installing “skills” – that is, additional functionality. Literally anyone can go ahead and build an Alexa Skill. Amazon has made an extensive collection of tools, tutorials, and APIs available to the public to help users take their first steps into the fascinating world of machine learning. It could be you, check it out here.


We invited Alexa experts to give presentations at the meet-up and with them we discussed the different levels of software infrastructure for the Alexa Voice User Interface, and how user-friendly it really is.

Our first speaker, who kindly came all the way from Madrid just for us, was Alexa Technical Evangelist German Viscuso. An experienced developer, technical writer and consultant, he is dedicated to making technology approachable and to developing communities around it.

Our second speaker was Jieke Pan, passionate about team leadership and extreme programming, and director of engineering at Mobiquity Inc, a digital consulting company that partners with the world’s leading brands to design and deliver compelling digital services for their customers. Jieke told us specifically about the GoodNes skill that Mobiquity developed for the giant Nestlé using the Alexa VUI: GoodNes combines voice and visuals to create a smart cooking assistant and provide the ultimate cooking experience to users.


Understanding how Alexa Skills work


Scenario: Mrs. Jones is 56 years old; she has three children and two cats. She loves pizzas, Stephen King’s books and quiet walks in the forest. She uses Alexa every day and built a “New York Pizza” skill on her device.

1/ Automatic speech recognition (ASR)

With ASR technology, Alexa can detect spoken words and convert them into computer text.

ASR can also be used to authenticate users via their voice: Mrs. Jones can register her voice when she first configures her device, which trains it to recognise her speech patterns and vocabulary and respond to them. That is to prevent her cats from trying to take over and destroy the world, obviously. This aspect of the technology, however, is sensitive to privacy laws – can a private company like Amazon really record and retain extremely personal customer information, such as a voice? For this reason, individual voice detection is only available in the US and is still in beta (not in production).

2/ Natural language understanding (NLU)

NLU is the post-processing of the recognised text: it identifies words and uses context to discern the meaning of a voice command. From that information, NLU can derive the intent (Mrs. Jones wants pizza) and send the correct output message (a structured representation of Mrs. Jones’ request) to the appropriate service (the pizza skill) for it to execute the action.
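To make this concrete, here is a sketch of the kind of structured request the NLU layer hands over, and how a skill might route it. The intent and slot names (“OrderPizzaIntent”, “size”, “topping”) are hypothetical, invented for Mrs. Jones’ “New York Pizza” skill; the JSON shape loosely follows the Alexa request format:

```python
import json

# A simplified Alexa-style IntentRequest, as the NLU layer might
# hand it to Mrs. Jones' hypothetical "New York Pizza" skill.
request = json.loads("""
{
  "request": {
    "type": "IntentRequest",
    "intent": {
      "name": "OrderPizzaIntent",
      "slots": {
        "size":    {"name": "size",    "value": "large"},
        "topping": {"name": "topping", "value": "pepperoni"}
      }
    }
  }
}
""")

def handle(event):
    """Route the structured request to the right action."""
    intent = event["request"]["intent"]
    slots = {k: v["value"] for k, v in intent["slots"].items()}
    if intent["name"] == "OrderPizzaIntent":
        return f"Ordering a {slots['size']} {slots['topping']} pizza."
    return "Sorry, I didn't understand that."

print(handle(request))  # prints: Ordering a large pepperoni pizza.
```

The key point is that by the time the skill code runs, all the messy acoustics and language understanding are done: the skill only ever sees clean, structured data.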

3/ Text-to-speech (TTS)

Alexa includes a text-to-speech (TTS) system that assigns phonetic transcriptions to written words and then sends them to a synthesiser, which converts the linguistic representation into sound. That’s how Alexa can speak!
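Skill developers can even shape how the TTS engine speaks using SSML (Speech Synthesis Markup Language), which Alexa accepts in skill responses. A small sketch, building an SSML string in Python (the sentence itself is made up, but the `<speak>`, `<say-as>`, `<break>` and whispered-effect tags are standard Alexa SSML):

```python
# SSML lets a skill control pauses, number pronunciation, and even
# a whispered voice effect in Alexa's spoken output.
ssml = (
    "<speak>"
    "Your pizza will arrive in "
    "<say-as interpret-as='cardinal'>30</say-as> minutes."
    "<break time='500ms'/>"
    "<amazon:effect name='whispered'>Don't tell the cats.</amazon:effect>"
    "</speak>"
)
print(ssml)
```

The TTS synthesiser interprets the markup rather than reading it aloud, so the listener hears a half-second pause and a whispered aside instead of angle brackets.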

Stuff to look forward to


Above is the basic cycle of events when Alexa and a human are interacting, but of course Alexa development teams are constantly looking to further improve the technology and make Alexa’s spoken delivery sound as natural as possible. Throughout the meet-up our speakers explored with us some of the exciting things that may well be in Alexa’s near future.

Artificial Intelligence systems are built on artificial neural networks, which are either feedforward (information moves in one direction) or recurrent (information moves in a cycle). The latter totally revolutionised deep learning: the Long short-term memory (LSTM) recurrent neural network makes a machine capable of not only processing but also remembering information. A common LSTM unit is composed of a cell, an input gate, an output gate and a forget gate. The cell remembers values over arbitrary time intervals, and the three gates regulate the flow of information into and out of the cell. To put it simply, LSTM can predict what’s coming based on past “experience”. This is a fantastic little piece of magic for an interface like Alexa, which thanks to LSTM can learn over time and produce increasingly human-like sentences the more it interacts with humans.
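The gate mechanics described above fit in a few lines of code. This is a toy NumPy implementation of a single LSTM cell step – a teaching sketch with random weights, nothing like Alexa’s actual network:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h_prev, c_prev, W, U, b):
    """One LSTM cell step: the forget, input and output gates
    regulate what the cell state remembers and emits."""
    n = h_prev.shape[0]
    z = W @ x + U @ h_prev + b      # all four gate pre-activations at once
    f = sigmoid(z[0*n:1*n])         # forget gate: what to drop from memory
    i = sigmoid(z[1*n:2*n])         # input gate: what new info to store
    o = sigmoid(z[2*n:3*n])         # output gate: what to expose
    g = np.tanh(z[3*n:4*n])         # candidate cell values
    c = f * c_prev + i * g          # updated cell state (the "memory")
    h = o * np.tanh(c)              # new hidden state
    return h, c

# Toy dimensions: 3-dim input, 2-dim hidden state, random weights.
rng = np.random.default_rng(0)
d, n = 3, 2
W = rng.normal(size=(4*n, d))
U = rng.normal(size=(4*n, n))
b = np.zeros(4*n)
h, c = np.zeros(n), np.zeros(n)
for x in rng.normal(size=(5, d)):   # feed a short input sequence
    h, c = lstm_step(x, h, c, W, U, b)
print(h.shape, c.shape)
```

Notice how the cell state `c` is carried from step to step: that running memory is exactly what lets an LSTM “predict what’s coming based on past experience”.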

Beyond voice context, LSTM also has applications for many more fun things, which are still in the experimenting phase. For instance, acoustic event detection could be such a strong asset for home security systems: imagine if Alexa was able to detect the sound of glass breaking, or a smoke alarm…

Although virtual assistants so far still speak in a relatively robotic, monotonous, emotionless way, it won’t be too long until the programming wizards of planet Earth manage to develop technology that gives the perfect illusion of a natural, spontaneous conversation, creating the ultimate user experience. All the big players in the market are working hard on this, and Amazon Alexa is no exception. Their latest text-to-speech system, which uses a generative neural network, can learn to employ a newscaster style from just a few hours of training data. The new whisper mode has been available since 2018, and is quite handy when you want to speak to Alexa without waking up your wife next to you. These are just two examples of how fast Alexa is changing, and it’s just the beginning!

While enjoying our well-deserved post meet-up drinks, we had lively debates and very interesting discussions regarding the future of Artificial Intelligence and how far humans are going to take it. It was overall a great evening. 

We are happy to read your thoughts in the comments section below, and we hope you can join us at the next meet-up!

Have a nice day,

The 4BIS team.
