Training speech recognizers, e.g., their language or acoustic models, on actual user data is useful, but retaining personally identifiable information may be restricted in certain environments due to regulations. Accordingly, a method or system is provided for enabling training of an acoustic model that includes dynamically shredding a speech corpus to produce text segments and depersonalized audio features corresponding to the text segments. The method further includes enabling a system to train an acoustic model using the text segments and the depersonalized audio features. Because the data is depersonalized, actual user data may be used, enabling speech recognizers to keep up to date with user trends in speech and usage, among other benefits.
1. A method of enabling training of an acoustic model, the method comprising:
dynamically shredding a speech corpus to produce text segments and depersonalized audio features corresponding to the text segments, the depersonalized audio features including filtered audio data remaining after speaker vocal characteristics and other audio characteristics have been removed, the speech corpus comprising a plurality of messages that each contain audio and corresponding text content, the shredding splitting each of the plurality of messages into strips, each strip comprising text segments and corresponding depersonalized audio features;
mixing up the strips of the text segments and corresponding depersonalized audio features to produce strips mixed up in randomized order; and
enabling a system to train an acoustic model using the strips mixed up in randomized order.
14. A system for enabling training of an acoustic model, the system comprising:
a shredding module configured to shred a speech corpus dynamically to produce text segments and depersonalized audio features corresponding to the text segments, the depersonalized audio features including filtered audio data remaining after speaker vocal characteristics and other audio characteristics have been removed, the speech corpus comprising a plurality of messages that each contain audio and corresponding text content, the shredding splitting each of the plurality of messages into strips, each strip comprising text segments and corresponding depersonalized audio features;
the shredding module further configured to mix up the strips of the text segments and corresponding depersonalized audio features to produce strips mixed up in randomized order; and
an enabling module configured to enable a system to train an acoustic model using the strips mixed up in randomized order.
20. A computer program product comprising a non-transitory computer-readable medium storing instructions for performing a method of enabling training of an acoustic model, the instructions, when loaded and executed by a processor, cause the processor to:
dynamically shred a speech corpus to produce text segments and depersonalized audio features corresponding to the text segments, the depersonalized audio features including filtered audio data remaining after speaker vocal characteristics and other audio characteristics have been removed, the speech corpus comprising a plurality of messages that each contain audio and corresponding text content, the shredding splitting each of the plurality of messages into strips, each strip comprising text segments and corresponding depersonalized audio features;
mix up the strips of the text segments and corresponding depersonalized audio features to produce strips mixed up in randomized order; and
enable a system to train an acoustic model using the strips mixed up in randomized order.
2. The method according to
extracting audio features from the speech corpus; and
depersonalizing the audio features.
3. The method according to
4. The method according to
5. The method according to
6. The method according to
7. The method according to
8. The method according to
9. The method according to
10. The method according to
11. The method according to
12. The method according to
13. The method according to
storing each text segment together with its corresponding depersonalized audio feature; and
randomizing the text segments and corresponding depersonalized audio features.
15. The system according to
extract audio features from the speech corpus; and
depersonalize the audio features.
16. The system according to
17. The system according to
18. The system according to
19. The system according to
store each text segment together with its corresponding depersonalized audio feature; and
randomize the text segments and corresponding depersonalized audio features.
This application is related to U.S. application Ser. No. 13/800,738, entitled “Data Shredding for Speech Recognition Language Model Training under Data Retention Restrictions,” filed on Mar. 13, 2013. The entire teachings of the above application are incorporated herein by reference.
A speech recognition system typically collects automatic speech recognition (ASR) statistics to train the speech recognition system. The ASR statistics can be used to train language models and acoustic models, which may be employed by the speech recognition system. In general, language models relate to the probability of particular word sequences. Acoustic models relate to sounds in a language.
A method or system of enabling training of an acoustic model according to an example embodiment of the present invention includes dynamically shredding a speech corpus to produce text segments and depersonalized audio features corresponding to the text segments; and enabling a system to train an acoustic model using the text segments and the depersonalized audio features.
In an embodiment, the method includes extracting audio features from the speech corpus and depersonalizing the audio features. Various operations and/or processes for depersonalizing the audio features are described herein and may be applied in combination. Depersonalizing the audio features can include applying cepstral mean subtraction (CMS), cepstral variance normalization, Gaussianisation or vocal tract length normalization (VTLN) to the audio features.
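The CMS and cepstral variance normalization just mentioned can be sketched in a few lines. This is an illustrative numpy implementation, not the system's actual front end; the frame count and coefficient count are assumptions:

```python
import numpy as np

def cms_cvn(frames):
    """Cepstral mean subtraction (CMS) plus cepstral variance
    normalization over one utterance: subtract the per-coefficient mean
    (removing constant channel/speaker offsets) and scale each
    coefficient to unit variance. `frames` is a (num_frames,
    num_coeffs) array of cepstral features."""
    mean = frames.mean(axis=0)
    std = frames.std(axis=0)
    # Guard against zero-variance coefficients before dividing.
    return (frames - mean) / np.where(std > 0, std, 1.0)

# A synthetic utterance: 200 frames of 13 cepstral coefficients, with a
# constant offset standing in for a channel/speaker bias.
utterance = np.random.randn(200, 13) + 7.5
normalized = cms_cvn(utterance)
```

After normalization, each coefficient has zero mean and unit variance across the utterance, so constant speaker- or channel-specific offsets no longer survive in the features.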
Alternatively or in addition, depersonalizing the audio features can include using a neural network to depersonalize the audio features. This can include using a neural network system based on trainable features, where the features are trained to produce the posterior probability of the current frame of input (or of a frame at a fixed offset from the current frame) corresponding to one or more of a set of linguistic units, including word and sub-word units. The linguistic units can, for example, include phone units, context-dependent phone units, grapheme units and the like. A phone unit is a sound of the language or speech, and a grapheme unit is a character, e.g., a letter of an alphabet. The depersonalized features can be a fixed linear or non-linear transform of the trainable features. The depersonalized features may be produced via an intermediate “bottleneck” layer created to produce features in trainable structures, such as multi-layer perceptrons, deep neural networks and deep belief networks.
Alternatively or in addition, depersonalizing the audio features can include applying one or more (e.g., a set of) speaker-specific transforms to the audio features to remove speaker information. The types of speaker-specific transforms that may be used can include linear transforms, such as constrained maximum likelihood linear regression and variants, along with speaker-specific non-linear transforms.
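As an illustration of the speaker-specific linear transforms mentioned above, the following sketch applies an affine, CMLLR-style transform per frame. The transform parameters here are invented for demonstration; a real system would estimate them per speaker and then use them to map that speaker's features toward a speaker-independent space:

```python
import numpy as np

def apply_speaker_transform(features, A, b):
    """Apply an affine transform x' = A x + b to each row (frame) of
    `features`. With A and b estimated for a given speaker, the output
    features carry less of that speaker's information."""
    return features @ A.T + b

num_coeffs = 13
features = np.random.randn(50, num_coeffs)
A = np.eye(num_coeffs) * 0.9   # illustrative only, not an estimated CMLLR matrix
b = np.full(num_coeffs, 0.1)
transformed = apply_speaker_transform(features, A, b)
```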
Dynamically shredding the speech corpus may include aligning text and audio of the speech corpus and splitting the text and audio at convenient places, such as natural breaks in the speech corpus, for example, breaks corresponding to pauses or phrase boundaries.
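A minimal sketch of splitting at natural breaks, assuming a word-level time alignment is already available; the tuple format and pause threshold are illustrative assumptions:

```python
def shred_at_pauses(alignment, min_pause=0.3):
    """Split a time-aligned transcript into shreds at natural breaks.
    `alignment` is a list of (word, start_sec, end_sec) tuples; a new
    shred starts wherever the silence between consecutive words
    exceeds `min_pause` seconds."""
    shreds, current = [], []
    for word in alignment:
        # Compare this word's start time with the previous word's end time.
        if current and word[1] - current[-1][2] > min_pause:
            shreds.append(current)
            current = []
        current.append(word)
    if current:
        shreds.append(current)
    return shreds

aligned = [("call", 0.0, 0.3), ("me", 0.35, 0.5),
           ("tomorrow", 1.2, 1.8), ("please", 1.85, 2.1)]
# The 0.7 s gap after "me" is a natural break, so two shreds result.
shreds = shred_at_pauses(aligned)
```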
The method may further include filtering the depersonalized audio features by removing depersonalized audio features longer than a certain length. Such filtering can include examining the content of the text segments and removing depersonalized audio features based on the content of the corresponding text segments. For example, the depersonalized audio features whose corresponding text segments contain a phone number and at least two more words may be removed.
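The two filtering rules described above (a maximum shred length, and a phone number accompanied by at least two more words) might be sketched as follows; the phone-number pattern and frame limit are simplified assumptions:

```python
import re

# Simplistic North-American-style pattern, purely for illustration.
PHONE = re.compile(r"\b\d{3}[-.\s]?\d{3}[-.\s]?\d{4}\b")

def keep_shred(text, num_frames, max_frames=500):
    """Return False for shreds that should be filtered out: shreds
    longer than `max_frames`, and shreds whose text pairs a phone
    number with at least two more words (a number plus surrounding
    context is riskier than a number alone)."""
    if num_frames > max_frames:
        return False
    if PHONE.search(text):
        other_words = [w for w in PHONE.sub(" ", text).split() if w]
        if len(other_words) >= 2:
            return False
    return True
```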
In an embodiment, the method includes maintaining a store of the text segments and the corresponding depersonalized audio features. Maintaining the store can include storing each text segment together with its corresponding depersonalized audio feature and randomizing the text segments and corresponding depersonalized audio features.
In one embodiment, a system for enabling training of an acoustic model includes a shredding module configured to shred a speech corpus dynamically to produce text segments and depersonalized audio features corresponding to the text segments. The system further includes an enabling module configured to enable a system to train an acoustic model using the text segments and the depersonalized audio features.
Embodiments of the present invention have many advantages. Dynamically shredding the text and/or speech corpus, as described herein, results in a list of text segments, e.g., n-grams, and their associated depersonalized audio features (DAFs). The text segments and DAFs cannot be traced back to the original messages, since the original messages (text and audio) themselves are not retained, i.e., they are deleted. Furthermore, embodiments can prevent re-construction of the original messages, since all the text segments and corresponding DAFs (e.g., the shreds) can be randomized and aggregated across a large number of messages. In addition, embodiments allow for all other data from the original message (such as time of conversion, calling identifiers, etc.) to be deleted. What remains is a large collection of text segments (e.g., n-grams or n-tuples), with associated audio features, representing an aggregation of what has been said to the system. The collection of text segments (e.g., n-grams or n-tuples) and audio features can be maintained in a generic, impersonal form that is useful for training a speech recognition system to recognize future utterances. In certain embodiments, the resulting ASR statistics may contain no Personally Identifiable Information (PII).
The collection of ASR statistics is useful for (re-)training a speech recognition system that employs Language Models (LMs) and/or Acoustic Models (AMs). For example, when the original data cannot be retained, the ASR statistics can be used to retrain the ASR models (LM and AM). Benefits of using ASR statistics to (re-)train a speech recognition system include better accuracy of conversions, an ability to keep up to date with user trends in speech and usage, an ability to customize the speech recognition to the needs of specific users, and a reduction of the volume of unconvertible messages.
The foregoing will be apparent from the following more particular description of example embodiments of the invention, as illustrated in the accompanying drawings in which like reference characters refer to the same parts throughout the different views. The drawings are not necessarily to scale, emphasis instead being placed upon illustrating embodiments of the present invention.
A description of example embodiments of the invention follows.
Training of speech recognition systems typically requires in-domain training data, but the in-domain data often contains personally identifiable information (PII) that cannot be retained due to data retention restrictions.
In general, the audio data 102 is captured or generated by a user of the speech recognition system 100 and may be considered an input to the speech recognition system. The metadata 110 relates to the audio data 102, may be generated or used as part of the processing of the audio data 102, and may be provided to the speech recognition system 100. The metadata is usually delivered in addition to the audio itself. For example, a carrier will send a voice mail recording and, at the same time (e.g., in an XML format), the number of the caller. This additional descriptive data, i.e., data about the actual audio data, is commonly referred to as metadata. In a dictation application, metadata can, for example, include the time of the dictation and the name of the dictating user. In a police interview application, metadata can, for example, include the participant(s), the police case number and the like. The metadata 110 may be used in an embodiment to label the text corpus, segments of text and/or counts of the text segments with the metadata. The transcript data 120 typically relates to the output of the speech recognition system 100, for example, the presentation of the converted text to the user. In some cases, the transcript data 120 can include corrections of the automatic speech recognition output by a human operator/user, or an entirely manually created transcription.
Embodiments of the invention split the speech and/or text corpus up into smaller bits or shreds that are still usable for training while not containing personally identifiable information (PII). This process can be a compromise, because the smaller the bits or shreds, the less useful they are for training but the lower the risk of accidentally retaining any PII. By depersonalizing and/or filtering the data, e.g., the audio features and text segments, the systems and methods described herein can keep larger shreds while still removing PII.
Labeling the corpus and/or the text segments and counts with metadata is useful so that one can still train specific ASR models, or sub-models, after shredding. For example, if one wanted to train a model for the weekend, the metadata can be used to select, from the collection of shreds or segments, the shreds or segments from messages received during weekends.
Acoustic feature extraction results in a compression of the audio data. For example, the audio data may be processed in frames, where each frame includes a certain number of audio samples, e.g., 80 audio samples per frame. The acoustic feature extraction may then reduce each frame to, for example, 13 depersonalized audio features. Because the original audio data are not retained, this reduction cannot be undone.
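The compression can be illustrated with a small calculation. The 8 kHz sampling rate below is an assumption; only the 80-samples-per-frame and 13-features-per-frame figures come from the example above:

```python
# Assumed telephony-style sampling rate; 80 samples then correspond to
# one 10 ms frame.
sample_rate = 8000
samples_per_frame = 80
features_per_frame = 13

frames_per_second = sample_rate // samples_per_frame                     # 100 frames/s
audio_values_per_minute = sample_rate * 60                               # raw samples per minute
feature_values_per_minute = frames_per_second * features_per_frame * 60  # feature values per minute
compression_ratio = audio_values_per_minute / feature_values_per_minute
```

Under these assumptions, a minute of audio shrinks from 480,000 sample values to 78,000 feature values, roughly a factor of six, before any bit-width differences between samples and features are considered.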
Acoustic feature extraction is only a partially reversible process. The original audio cannot be re-created, but some form of audio can be: audio that contains the same words as originally spoken but without, for example, speaker-specific intonation.
In some embodiments, the audio features 210 are extracted from the speech corpus 302 and depersonalized. Depersonalization of the audio features may include applying cepstral mean subtraction (CMS), cepstral variance normalization, Gaussianisation, or vocal tract length normalization (VTLN) to the audio features. CMS is useful in removing an offset. For example, CMS can be used to remove a voice from a communication channel. VTLN is useful to normalize voices or voice data. It has been observed that female speakers typically have a shorter vocal tract than male speakers. VTLN can be used to normalize the voice data based on that observation.
Depersonalizing the audio features can include using a neural network to depersonalize the audio features. For example, a neural network system based on trainable features may be used, where the features which are trained to produce the posterior probability of the current frame of input (or a frame with a fixed offset to the current frame) correspond to one or more of a set of linguistic units including word and sub-word units, such as phone units, context-dependent phone units, grapheme units and the like. The depersonalized features can be a fixed linear or non-linear transform of the trainable features. The depersonalized features may be produced via an intermediate “bottleneck” layer created to produce features in trainable structures, such as multi-layer perceptrons, deep neural networks and deep belief networks. Furthermore, depersonalizing the audio features can include applying one or more (e.g., a set of) speaker-specific transforms to the audio features to remove speaker information. The types of speaker-specific transforms that may be used can include linear transforms, such as constrained maximum likelihood linear regression and variants, and speaker-specific non-linear transforms. An advantage of applying speaker-specific transforms is that the system can train for each speaker in the set (using any transform). The system can train the speaker characteristics in order to remove them to thereby depersonalize the audio features.
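A toy sketch of reading features out of a bottleneck layer follows. The layer sizes are arbitrary and the weights are random, since the point is only *where* the depersonalized features are taken from, not a trained network; in a real system the final layer would be trained to give phone or grapheme posteriors and the weights learned from transcribed speech:

```python
import numpy as np

rng = np.random.default_rng(0)

def bottleneck_features(x, dims=(13, 64, 8, 64, 40)):
    """Forward pass of a small MLP whose narrowest hidden layer (8 units
    here) acts as a 'bottleneck'. The depersonalized features are the
    activations at that layer, not the final posterior outputs."""
    h = x
    bottleneck = None
    for i in range(len(dims) - 1):
        # Random, untrained weights -- illustration only.
        w = rng.standard_normal((dims[i], dims[i + 1])) / np.sqrt(dims[i])
        h = np.tanh(h @ w)
        if dims[i + 1] == min(dims[1:-1]):   # capture the narrowest hidden layer
            bottleneck = h
    return bottleneck

frame = rng.standard_normal((1, 13))   # one frame of 13 cepstral coefficients
feats = bottleneck_features(frame)
```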
The collection or store 314 of strips or shreds 308 can be mixed up in a randomization or mixing operation 316. Each text segment 312 and the corresponding depersonalized audio feature 310 can be stored and maintained for use by the system. For example, the text segments 312 and audio features 310 can be used to enable training of an acoustic model. The fact that the shreds 308 are in randomized order does not affect the training, because acoustic models for speech recognition relate to individual sounds. Maintaining the store can include storing each text segment 312 together with its corresponding depersonalized audio feature 310, the text segments and corresponding depersonalized audio features being randomized, as shown at 318.
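The store-and-randomize behavior can be sketched as follows; the class name and the shred representation (a text tuple paired with a feature list) are illustrative assumptions:

```python
import random

class ShredStore:
    """Keep (text_segment, daf) pairs together, in randomized order, so
    the stored shreds never reflect the order of the original messages
    and cannot be re-sequenced into them."""
    def __init__(self, seed=None):
        self._shreds = []
        self._rng = random.Random(seed)

    def add(self, text_segment, daf):
        # Insert at a random position rather than appending.
        pos = self._rng.randrange(len(self._shreds) + 1)
        self._shreds.insert(pos, (text_segment, daf))

    def all_shreds(self):
        return list(self._shreds)

store = ShredStore(seed=42)
for text, daf in [(("send", "the", "report"), [0.1, 0.2]),
                  (("by", "friday"), [0.3]),
                  (("thanks",), [0.4])]:
    store.add(text, daf)
```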
In some embodiments, the method or system of enabling training of an acoustic model may further include filtering the depersonalized audio features, for example, by removing the depersonalized audio features that are longer than a certain length. Filtering the depersonalized audio features can include examining the content of the text segments and removing the depersonalized audio features based on the content of the corresponding text segments. In some embodiments, removing the depersonalized audio features includes removing the depersonalized audio features whose corresponding text segments contain a phone number and at least two more words.
It should be noted that the shredding process as described herein is a one-way only process. The original message, or message corpus, cannot be reconstructed from the shreds.
The number of occurrences of audio features for each text segment is an indication of how many times a particular text segment was spoken in a particular speech corpus.
Optionally, filtering the DAFs can be combined with content identification. For example, filtering the DAFs can include examining the content of the text segments and removing DAFs based on the content of the corresponding text segments.
The segments of text can be n-tuples (or n-grams) and may be non-overlapping segments of text. In some embodiments, the system 700 may be configured to maintain a store of the segments of text and the counts. As shown in
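Counting non-overlapping n-tuples, as described above, might look like this minimal sketch (the function name and message format are assumptions):

```python
from collections import Counter

def segment_counts(messages, n=3):
    """Count non-overlapping n-tuples of words across messages. Only
    the aggregated counts need be kept; the messages themselves can
    then be deleted."""
    counts = Counter()
    for text in messages:
        words = text.split()
        # Step by n so segments do not overlap.
        for i in range(0, len(words) - n + 1, n):
            counts[tuple(words[i:i + n])] += 1
    return counts

counts = segment_counts(["please call me back today ok",
                         "please call me tomorrow morning then"], n=3)
```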
The system 700 may be configured to maintain a list of the class labels and counts corresponding to the class labels, the list not being linked to the corpus. For example, the system may retain one or more general class-membership frequency lists, which are stored separately and without link or reference to any individual message or document. The system 700 can maintain counts of what has been replaced per class label. In the above example, maintaining the list would result in count(MaleFirstName, Uwe)+=1 and count(FemaleFirstName, Jill)+=1. The system does not maintain any link to where in the depersonalized corpus these instances came from; it does, however, keep track of how common “Uwe” is as a MaleFirstName. In an embodiment, the depersonalization module 708 is configured to maintain the list of class labels.
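The class-label frequency bookkeeping can be sketched as follows. The dictionary-based classifier is a stand-in for whatever name lists or taggers a real system would use; note that the counter records (label, token) frequencies only, with no reference back to any message:

```python
from collections import Counter

class_counts = Counter()

def depersonalize_token(token, classifier):
    """Replace a PII token with its class label and bump the global
    frequency list; no link to the source message is retained."""
    label = classifier.get(token)
    if label is None:
        return token
    class_counts[(label, token)] += 1
    return label

# Illustrative classifier mapping tokens to class labels.
classifier = {"Uwe": "MaleFirstName", "Jill": "FemaleFirstName"}
out = [depersonalize_token(t, classifier)
       for t in "tell Uwe and Jill hi".split()]
```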
The system 700 can further include a filtering module 710 configured to filter the segments of text by removing from the segments of text those segments that contain personally identifiable information. The system 700 may further include a labeling module 714 configured to label the text segments and the counts with metadata. The metadata can be leveraged to accumulate statistics per metadata value/cluster. For example, the system may track Count(Year=2012,WordTuple), where WordTuple denotes the text segment(s). The metadata may include at least one of the following: time of day of the message, area code of the sender, area code of the recipient, or call duration.
In an embodiment, the system 700 includes an indexing module 716 configured to replace one or more words of the corpus with corresponding one or more word indices, wherein each word index is generated by a random hash. Furthermore, the system, e.g., indexing module 716, may be configured to keep a map to the random hashes secure.
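A sketch of the indexing module's behavior, using Python's `secrets` module as one way to generate random indices; the class name and index width are assumptions:

```python
import secrets

class SecureWordIndexer:
    """Replace words with random indices. The reverse map (the 'map to
    the random hashes') is what must be kept secure and separate: with
    it, indices resolve back to words; without it, they are opaque."""
    def __init__(self):
        self._word_to_index = {}
        self._index_to_word = {}   # keep this map secure

    def index(self, word):
        if word not in self._word_to_index:
            idx = secrets.token_hex(8)   # random 64-bit index as hex
            self._word_to_index[word] = idx
            self._index_to_word[idx] = word
        return self._word_to_index[word]

indexer = SecureWordIndexer()
indexed = [indexer.index(w) for w in "the quick fox the".split()]
```

Repeated words map to the same index, so n-tuple counts over indexed text remain meaningful even though the words themselves are hidden.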
A system in accordance with the invention has been described that enables a system, e.g., a speech recognition system, to train a language model and/or an acoustic model. Components of such a system, for example the shredding module, segmentation module, enabling module and other modules discussed herein, may each be implemented as a portion of program code operating on a computer processor.
Portions of the above-described embodiments of the present invention can be implemented using one or more computer systems, for example, to permit generation of ASR statistics for training of a language and/or an acoustic model. For example, the embodiments may be implemented using hardware, software or a combination thereof. When implemented in software, the software code can be stored on any form of non-transient computer-readable medium and loaded and executed on any suitable processor or collection of processors, whether provided in a single computer or distributed among multiple computers.
Further, it should be appreciated that a computer may be embodied in any of a number of forms, such as a rack-mounted computer, desktop computer, laptop computer, or tablet computer. Additionally, a computer may be embedded in a device not generally regarded as a computer but with suitable processing capabilities, including a Personal Digital Assistant (PDA), a smart phone or any other suitable portable or fixed electronic device.
Also, a computer may have one or more input and output devices. These devices can be used, among other things, to present a user interface. Examples of output devices that can be used to provide a user interface include printers or display screens for visual presentation of output and speakers or other sound generating devices for audible presentation of output. Examples of input devices that can be used for a user interface include keyboards, and pointing devices, such as mice, touch pads, and digitizing tablets. As another example, a computer may receive input information through speech recognition or in other audible format.
Such computers may be interconnected by one or more networks in any suitable form, including as a local area network or a wide area network, such as an enterprise network or the Internet. Such networks may be based on any suitable technology and may operate according to any suitable protocol and may include wireless networks, wired networks or fiber optic networks.
Also, the various methods or processes outlined herein may be coded as software that is executable on one or more processors that employ any one of a variety of operating systems or platforms. Additionally, such software may be written using any of a number of suitable programming languages and/or programming or scripting tools, and also may be compiled as executable machine language code or intermediate code that is executed on a framework or virtual machine.
In this respect, at least a portion of the invention may be embodied as a computer readable medium (or multiple computer readable media) (e.g., a computer memory, one or more floppy discs, compact discs, optical discs, magnetic tapes, flash memories, circuit configurations in Field Programmable Gate Arrays or other semiconductor devices, or other tangible computer storage medium) encoded with one or more programs that, when executed on one or more computers or other processors, perform methods that implement the various embodiments of the invention discussed above. The computer readable medium or media can be transportable, such that the program or programs stored thereon can be loaded onto one or more different computers or other processors to implement various aspects of the present invention as discussed above.
In this respect, it should be appreciated that one implementation of the above-described embodiments comprises at least one computer-readable medium encoded with a computer program (e.g., a plurality of instructions), which, when executed on a processor, performs some or all of the above-described functions of these embodiments. As used herein, the term “computer-readable medium” encompasses only a non-transient computer-readable medium that can be considered to be a machine or a manufacture (i.e., article of manufacture). A computer-readable medium may be, for example, a tangible medium on which computer-readable information may be encoded or stored, a storage medium on which computer-readable information may be encoded or stored, and/or a non-transitory medium on which computer-readable information may be encoded or stored. Other non-exhaustive examples of computer-readable media include a computer memory (e.g., a ROM, RAM, flash memory, or other type of computer memory), magnetic disc or tape, optical disc, and/or other types of computer-readable media that can be considered to be a machine or a manufacture.
The terms “program” or “software” are used herein in a generic sense to refer to any type of computer code or set of computer-executable instructions that can be employed to program a computer or other processor to implement various aspects of the present invention as discussed above. Additionally, it should be appreciated that according to one aspect of this embodiment, one or more computer programs that when executed perform methods of the present invention need not reside on a single computer or processor, but may be distributed in a modular fashion amongst a number of different computers or processors to implement various aspects of the present invention.
Computer-executable instructions may be in many forms, such as program modules, executed by one or more computers or other devices. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. Typically, the functionality of the program modules may be combined or distributed as desired in various embodiments.
While this invention has been particularly shown and described with references to example embodiments thereof, it will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the scope of the invention encompassed by the appended claims. It should also be appreciated that the various technical features of the embodiments that have been described may be combined in various ways to produce numerous additional embodiments.
Ganong, III, William F., Jost, Uwe Helmut, Woodland, Philip Charles, Vozila, Paul J., Katz, Marcel, Shahid, Syed Raza