— Ch. 1 · The Vanishing Gradient Problem —
Long short-term memory.
In 1991, Sepp Hochreiter submitted a German diploma thesis that identified a critical flaw in early recurrent neural networks. The problem was practical and computational rather than theoretical: when these classic models were trained with back-propagation, the gradients carrying long-range error signals tended to vanish. As the error was propagated backwards through many time steps, it was repeatedly multiplied by small factors, and the resulting values became so tiny that the model effectively stopped learning. This phenomenon became known as the vanishing gradient problem, and it prevented standard RNNs from learning arbitrarily long-term dependencies in input sequences. Hochreiter's supervisor Jürgen Schmidhuber considered the thesis highly significant for its identification of this barrier.
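To make the effect concrete, here is a minimal sketch (not from the original text) of a scalar recurrence h_t = tanh(w * h_{t-1}) with an assumed weight of w = 0.5. The chain rule multiplies one small factor per time step, so the gradient of a late state with respect to the initial state shrinks towards zero:

import numpy as np

w = 0.5      # recurrent weight, chosen small for the demonstration
h = 0.8      # initial hidden state
grad = 1.0   # running product of per-step derivatives, i.e. d h_t / d h_0

for t in range(1, 51):
    h = np.tanh(w * h)
    # chain rule factor for this step: d h_t / d h_{t-1} = w * (1 - tanh(...)^2)
    grad *= w * (1.0 - h ** 2)
    if t % 10 == 0:
        print(f"step {t:2d}: |d h_t / d h_0| = {abs(grad):.3e}")

Running this prints gradient magnitudes that collapse by many orders of magnitude within a few dozen steps, which is the behaviour Hochreiter analysed.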
Architectural Design and Gates
An LSTM unit typically consists of a cell and three gates: an input gate, an output gate, and a forget gate. The cell remembers values over arbitrary time intervals, while the gates regulate the flow of information into and out of the cell. The forget gate decides what information to discard from the previous cell state by mapping each component to a value between zero and one: a value close to one means the information is retained, while a value close to zero means it is discarded. The input gate decides which pieces of new information to store in the current cell state, using the same zero-to-one scheme. The output gate controls which pieces of the current cell state to output, assigning a value between zero and one based on the current input and the previous hidden state. By selectively outputting only relevant information, the network can maintain useful long-term dependencies.
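The following NumPy sketch shows one possible forward step of such a cell, assuming the standard gate formulation; the weight names (W_f, U_f, and so on), the dimensions, and the random initialisation are invented for the example and are not part of the original text:

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x_t, h_prev, c_prev, params):
    """Advance the cell one time step and return (h_t, c_t)."""
    W_f, U_f, b_f, W_i, U_i, b_i, W_o, U_o, b_o, W_c, U_c, b_c = params

    f_t = sigmoid(W_f @ x_t + U_f @ h_prev + b_f)    # forget gate: 0 = discard, 1 = keep
    i_t = sigmoid(W_i @ x_t + U_i @ h_prev + b_i)    # input gate: how much new info to write
    o_t = sigmoid(W_o @ x_t + U_o @ h_prev + b_o)    # output gate: how much of the cell to expose
    c_hat = np.tanh(W_c @ x_t + U_c @ h_prev + b_c)  # candidate values for the cell

    c_t = f_t * c_prev + i_t * c_hat                 # new cell state
    h_t = o_t * np.tanh(c_t)                         # new hidden state / output
    return h_t, c_t

# Tiny usage example with random weights (hidden size 4, input size 3).
rng = np.random.default_rng(0)
n_h, n_x = 4, 3
params = []
for _ in range(4):  # forget, input, output, candidate
    params += [rng.normal(size=(n_h, n_x)), rng.normal(size=(n_h, n_h)), np.zeros(n_h)]
h, c = np.zeros(n_h), np.zeros(n_h)
h, c = lstm_step(rng.normal(size=n_x), h, c, params)
print(h)

Note that the forget gate multiplies the previous cell state element-wise rather than passing it through a squashing nonlinearity at every step, which is what lets the cell carry information across long spans of time.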