A computer implemented method which accesses multiple sets of narrative data, each set of narrative data comprising event records mapped to one of a plurality of story rules to create a plurality of story event sequences based on the event records and the story rules. The method compares a first story event sequence and a second story event sequence. The comparison comprises a method for determining a taxonomical distance between elements of narrative data in one set of narrative data to a corresponding element of narrative data in another set of narrative data; and performing an optimal matching between events in two sets of narrative data. The method outputs a similarity result between the sets of narrative data.
|
1. A machine implemented method, comprising:
accessing multiple sets of narrative data, each set of narrative data corresponding to a story and further comprising a set of threads into which a plurality of words of the story are reordered and event records mapped to one of a plurality of story rules to create a plurality of story event sequences based on the event records and the story rules;
comparing a first story event sequence and a second story event sequence, the comparing comprising at least one of determining a taxonomical distance between an element of narrative data in one set of narrative data to a corresponding element of narrative data in another set of narrative data; and performing an optimal matching between events in two sets of narrative data; and
outputting a similarity result displaying a similarity of the stories that correspond to the sets of narrative data.
9. A computer implemented method comparing at least two stories, comprising:
accessing first narrative data corresponding to a first story comprising a first sequence of event records, the event records ordered in a first story sequence;
accessing second narrative data corresponding to a second story comprising a second sequence of event records ordered in a second story sequence;
comparing the first sequence and the second sequence, the comparing comprising determining a taxonomical distance between a classification of the first story sequence to a classification of the second story sequence, and performing an optimal matching between events in the first story sequence and the second story sequence and
selecting one of the taxonomical distance determination and the optimal matching to provide a similarity result between the sets of narrative data; and
outputting the similarity result displaying a similarity of the first and second stories that correspond to the first and second narrative data.
16. A computing system, comprising:
a processor and a non-transitory storage medium code in the non-transitory storage medium, the code instructs the processor, the code including code configured to cause the processor to:
access multiple sets of narrative data, each set of narrative data corresponding to a story and further comprising event records mapped to one of a plurality of story rules to create a plurality of story event sequences based on the event records and the story rules;
compare a first story event sequence and a second story event sequence, including at least one of calculate a taxonomical distance between an element of narrative data in one set of narrative data to a corresponding element of narrative data in another set of narrative data; and
calculate optimal matching between events in two sets of narrative data using semiotic operations of a semiotic square, including substituting opposing values in a semiotic square during the optimal matching; and
output a similarity result displaying a similarity of the stories based on the taxonomical distance and the optimal matching between the sets of narrative data.
2. The machine implemented method of
3. The machine implemented story of
5. The machine implemented method of
6. The machine implemented method of
7. The machine implemented method of
8. The machine implemented method of
10. The computer implemented method of
11. The computer implemented method of
12. The computer implemented method of
13. The computer implemented method of
14. The computer implemented method of
15. The computer implemented method of
17. The computing system of
18. The computing system of
19. The computing system of
20. The computing system of
|
Narratives may comprise written accounts of connected events. A story may include a narrative, actors or characters, and descriptive elements. Written stories are a primary way that many people consume news and entertainment. Many stories have a dynamic quality in that the story may change over time. News stories are updated as new related events occur, requiring news providers to gather new information, generate new text, and update any previous version of the story or author a completely new story.
Many people consume news stories in electronic form on various types of user devices. There is therefore a large availability of electronic narrative data which can be processed by computing devices. Because of the large amount of information available via public and private networks in electronic form, processing such data becomes increasingly difficult.
Linguistic computing presents a number of challenges in terms of defining relationships between complex story events.
The technology, briefly described, comprises a system and method for analyzing narrative data to produce a comparison of stories, reflected as a sequence of story events, to detect similarities between one or more stories. The technology takes the form of a computer implemented method, a computing system and/or code configured to instruct a computing system to perform a plurality of computing steps to produce an analysis of input narrative data. The technology may comprise a computer implemented method which accesses multiple sets of narrative data, each set of narrative data comprising event records mapped to one of a plurality of story rules to create a plurality of story event sequences based on the event records and the story rules. The method compares a first story event sequence and a second story event sequence. The comparison comprises a method for determining a taxonomical distance between elements of narrative data in one set of narrative data to a corresponding element of narrative data in another set of narrative data; and performing an optimal matching between events in two sets of narrative data. The method outputs a similarity result between the sets of narrative data.
This Summary is provided to introduce a selection of concepts in a simplified form that are further described below in the Detailed Description. This Summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used as an aid in determining the scope of the claimed subject matter.
The technology discussed herein comprises a system and methods for analyzing narratives provided in electronic narrative data form utilizing a processing system. The technology uses patterns in the electronic narrative data and data structures incorporating data defined using semiotic squares to analyze stories and elements of textual content, find similarities amongst various stories, predict events which may occur in the stories, and provide various types of analysis on the narrative data in a programmatic manner.
The technology herein decomposes each set of narrative data into a series of discrete events, each event having a specific data structure and elements, and maps the events onto stories. Events are created based on a semiotic ontology, which defines verbs in sentences as one of a number of different functional types. Based on the defined functional ontology, sentences are broken down into cases and events. Events can then be utilized to provide the different types of analyses, including creating a timeline, making a prediction about the next event in the story, and finding similarities between an individual story and other stories based on their semiotic structures.
The technology herein is based on semiotic squares at multiple different levels. At a first level, the squares are used to analyze the ontology of transaction-based functional verbs. A data structure comprising a plurality of verbs utilized in a particular language includes the root form for the semiotic square, and any number of verbs which complete one or more semiotic squares for that particular form. This semiotic data structure can be utilized at various levels in the analysis. At another level, the system looks for patterns which can be used to analyze stories. At yet another level, substitutions from a semiotic square into various events are utilized to detect similarities and semiotic structures across various stories.
SPA represents nominal, positive action. A subject's energy is oriented towards the achievement of a goal. SPNA is the nominal, failure to achieve the action which is supposed to be initiated by its subject. The failure is realized in absentia and may be reflected as a lack or loss of resources, skills, or energy. SPB represents energetic, emotional, negative counter action (antagonism) relative to SPA, oriented against the action which is supposed to be initiated by its subject, or towards the achievement of the failure thereof. The counteraction is realized in praesentia: sabotage, boycott. SPNB represents an energetic, emotional, or positive action. The subject's energy is oriented towards an unrestricted, self-indulging achievement of the goal.
Note that SPA or “positive” action refers to the expected result, and might have a negative or regressive formulation. Likewise, the SPB or antagonism can have a positive flavor. For example, one semiotic square may be reflected as: SPA—damage control; SPNA—out of control; SPB—proliferation; and SPNB—elimination.
Deixis are fundamental dimensions of the semiotic square. There is one “positive” deixis, and one “negative” deixis. The deixis is a posture “for” and “against” to emphasize that the two “sides” of the basic semiotic square are exclusive and potentially argumentative. The deixis is not only a certain value and a certain orientation, it is also a statement which may be supportive or adverse. The deixis height is described by its orientation, and is modulated by its intensity.
The technology utilizes an application of the semiotic square to various narratives. The technology herein is based on the premise that communication is consubstantial to narratives, and using the core lexical semantics of a semiotic square results in a micro narrative for each term defined as the root of the semiotic square (SPA).
The semiotic square model allows core lexical semantics to be applied consistently to the full vocabulary of the English language (or any other language). In
The two deixis, positive and negative, are symmetrical. The left (positive) deixis is generally based on a presupposition of states and gradation of intensity: excess directly presupposes assertion, and is a more intense state of the same. The negative deixis reflects a denial and presupposes absence: anorexia presupposes diet. Both deixis, though, are first and foremost defined by their argumentative stake: anorexia is more exactly defined as an inverted gluttony than as an excessive diet, and the reverse is true for anorexia vs. gluttony.
A swap of these two positions is a staple of plot reversals. For example, some stories reflect that a most hardened criminal can be converted into a good person. The principles are used in analysis of narrative data relative to stories in the present technology.
Narrative data 218 may comprise data input from various sources such as commercial news sources, user data, social networks, or any third party narrative textual data source. Multiple sets of data may be provided as a set of data. Narrative data may comprise a story drafted on a given date at a given time.
Each story comprises a narrative and cast, where each narrative is a pattern of events. Multiple stories based on multiple items of narrative data 218 may be analyzed, each story analysis building on any previous story analysis performed by computing environment 200. Various versions of a single story theme or subject may be provided.
Analyses performed by the computing environment 200 are provided to an output 260. The output 260 may comprise, in various environments, any form of user perceptible device including, but not limited to, a display device, another computing environment, a processing device, or a processing service, such as a story-based social network as described herein.
Parser 205 is configured to accept each of any number of sets of narrative data 218, reduce each of the words therein into individual tokens and parts of speech, and present the tokens and parts of speech (POS) to the interpreter 212. Parser 205 is configured to evaluate English grammar patterns for phrases, clauses, and sentences, data for which is stored in a parser dictionary 215. A sentence pattern is a sequence combining clause patterns, phrase patterns, and parts of speech. The parser 205 breaks down sentence patterns to extract and classify parts of speech in incoming text in the narrative data 218.
Sentence rules 220 provide grammar patterns for phrases, clauses, and sentences, and are utilized by the parser 205 to break down the text narrative data 218 into usable data structures allowing computation and analysis as discussed herein.
The event interpreter 212 utilizes event rules 230, and semiotic square values and function maps data store 245 to classify functions and parts of speech into case structures (or frames), and later into event structures, which can then be mapped to story patterns.
The functions performed by the parser 205, event interpreter 212 and story interpreter 214 may be integrated into one or more software components run on a processing device similar to that illustrated in
The analysis engine 250 may perform various types of analyses on the data classified as events by the event interpreter. In one example, the story interpreter 214 orders a textual sequence from the narrative data 218 into a set of threads inside a story, sometimes referred to herein as a timeline.
Once one or more sets of narrative data 218 have been broken into discrete events by the event interpreter 212, each event may be stored as event data in the classified and mapped data store 280. The story interpreter 214 reads the classified and mapped data store 280 for event data to be mapped to stories. Events can be further processed by the analysis engine 250. Once the narrative data is broken into events, one can perform any number of different manipulations of the events, including but not limited to the creation of timelines, creating dynamically updating timelines (as new events occur), and predicting future events in timelines or stories.
Different types of analyses are provided by the analysis engine 250. In one type of analysis, analysis engine 250 can provide a timeline of discrete events, defined in terms of a number of events and the importance of the events. Still further, the analysis engine 250 can generate predictions using a prediction engine. A prediction of a next possible event in a story can be created using the techniques discussed herein. In addition, a similarity determination can be made between stories. For example, a current news story regarding the potential cheating of a sports organization's marquee quarterback can be compared and found similar to cheating scandals involving marquee players in other sports, or cheating scandals involving the same organization but involving different personnel.
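By way of a non-limiting illustration, the similarity determination may be sketched as follows. The sketch reads "taxonomical distance" as the path length between two function types in a small tree, and "optimal matching" as an edit-distance style alignment of two event sequences whose substitution cost is that distance. Both readings, the `PARENT` tree, the `indel` cost, and all function names are assumptions made for illustration only and are not taken from the description.

```python
# Hypothetical taxonomy of function types (child -> parent); illustrative only.
PARENT = {
    "DO": "FUNCTION", "TRANS": "FUNCTION",
    "MAKE": "TRANS", "TRANSFER": "TRANS", "TRANSACTION": "TRANS",
    "ATRANS": "TRANSFER", "MTRANS": "TRANSFER", "XTRANSFER": "TRANSFER",
    "ATRANSACT": "TRANSACTION", "XTRANSACT": "TRANSACTION",
}


def ancestors(node):
    """Return the node followed by its chain of parents up to the root."""
    path = [node]
    while node in PARENT:
        node = PARENT[node]
        path.append(node)
    return path


def taxonomical_distance(a, b):
    """Number of tree edges between a and b through their nearest common ancestor."""
    path_a, path_b = ancestors(a), ancestors(b)
    for depth_a, node in enumerate(path_a):
        if node in path_b:
            return depth_a + path_b.index(node)
    raise ValueError(f"{a!r} and {b!r} share no ancestor")


def optimal_matching(seq1, seq2, indel=2.0):
    """Minimum total cost of aligning two event sequences.

    Substituting one event for another costs their taxonomical distance;
    inserting or deleting an event costs `indel` (an assumed constant).
    """
    n, m = len(seq1), len(seq2)
    cost = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = i * indel
    for j in range(1, m + 1):
        cost[0][j] = j * indel
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            substitute = cost[i - 1][j - 1] + taxonomical_distance(seq1[i - 1], seq2[j - 1])
            cost[i][j] = min(substitute, cost[i - 1][j] + indel, cost[i][j - 1] + indel)
    return cost[n][m]
```

A lower cost indicates more similar event sequences; identical sequences yield a cost of zero under these assumptions.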
Once the ontology is created at 305, at 307 for each set of narrative data 218, narrative data source text is retrieved at 310, and parsed at 320 to find tokens (verbs) and identify parts of speech. The tokens and parts of speech are utilized to build case frames at 350 which are stored as element records (
The method moves on to the next input data source 380. At 390, based on the events table, functional mapping and narrative mapping, an analysis of the story can be prepared. Various examples of story analysis are described herein. Each of these steps is described further below.
With reference to step 305, in one embodiment, the method classifies words into an ontology including functions (verbs), actors (nouns) and shades (adjectives/adverbs). Each classification in this ontology describes an importance to various types of functions, actors, and shades. In accordance with the technology, a unique structure of property isotopies (physical, cognitive, and the like) is used to organize both entities and transformations, based on a functional mapping between transformations and properties.
For each classified word in the ontology, semiotic squares may be built. As used herein, a semiotic square for a word is provided in a data structure identifying for each root word those words related in a semiotic square. Some squares include all four terms of the square (SPA, SPB, SPNA, SPNB) while others may have only partial terms. Building a semiotic square for each word comprises creating a data structure including, for each word, between one and three additional terms completing the square. While conventional formalisms have language formalized as a first-order predicate logic in terms of dyadic antonyms, the semiotic square model adds two values to the dyadic antonyms and it adds a deictic perspective to the self-stated and context-free value of the initial pair.
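By way of a non-limiting illustration, a semiotic square data structure of the kind described above may be sketched as a record keyed by its root term, with the remaining three corners optional so that partial squares can be represented. The class name, field names, and helper methods are hypothetical; the "flow" values are those given elsewhere in this description, and the partial "give" square is truncated purely for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SemioticSquare:
    """One square keyed by its root term (SPA); other corners may be absent."""
    spa: str                     # nominal, positive action (root term)
    spna: Optional[str] = None   # nominal failure to achieve the action
    spb: Optional[str] = None    # energetic counter action (antagonism)
    spnb: Optional[str] = None   # energetic, unrestricted achievement

    def corner_of(self, word: str) -> Optional[str]:
        """Return the label of the corner holding `word`, or None if absent."""
        for label in ("spa", "spna", "spb", "spnb"):
            if getattr(self, label) == word:
                return label.upper()
        return None

    def is_complete(self) -> bool:
        """True when all four terms of the square are present."""
        return None not in (self.spna, self.spb, self.spnb)


# A complete square and a partial square (root plus one additional term).
FLOW = SemioticSquare(spa="flow", spna="trickle", spb="obstruct", spnb="pour")
PARTIAL_GIVE = SemioticSquare(spa="give", spb="deprive")
```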
Choices as to elements for each of the values of any square are defined in the semiotic square data store 245. In one example, data structures such as those illustrated in
In further consideration of step 305, ones of the squares defining verbs into various functional classifications are used interpretively. In accordance with the semiotic ontology, each verb in the language is classified into one of a number of functional types. Such functional types are classified as DO, and MAKE or TRANS. TRANS functions comprise three different types: TRANS-formation, TRANS-action, and TRANS-fer functions.
DO functions comprise all verbs that do not alter the state of a recipient (or actor), or do not involve an agent when they have a recipient (or actor).
A first type of DO verbs are the “state” and “activity”, “atelic”, verbs as defined in Vendler, Zeno (1957). “Verbs and times”. The Philosophical Review 66 (2): 143-160. Under Vendler's model, events may be classified into one of four aspectual classes: states, which are static and do not have an endpoint (“know,” “love”); activities, which are dynamic and do not have an endpoint (“run,” “drive”); accomplishments, which have an endpoint and are incremental or gradual (“paint a picture,” “build a house”); and achievements, which have an endpoint and occur instantaneously.
A second type of DO verbs describes natural or physical changes: erupt, blossom, effervesce, etc. Although this second group of verbs generally does not have actors or agents, it is possible to consider semiotic square patterns for each of these types of verbs. For example, “flow” is a state of a liquid. A semiotic square for “flow” may be SPA: “Flow,” SPB: “Obstruct,” SPNA: “Trickle,” SPNB: “Pour.” Note that DO verbs can have an instrument or object, for example “Walk with a stick.”
All other verbs fall into the TRANS category comprising: TRANS-formation, TRANS-fer and TRANS-action. Each such verb has an actor (person) who is an agent. TRANSFERS and TRANSACTION have an animated recipient actor, and are typically ditransitive. Examples include: accord, afford, allocate, allow, and appoint. Each of these examples is a verb which takes a subject and two objects which refer to a theme and a recipient. These objects may be called direct and indirect, or primary and secondary.
Each type of function in the ontology allows for basic assumptions about the use of the verb to be utilized in computational narrative analysis.
TRANSFORMATIONS generally refer to a function of “to MAKE” and are therefore classified by the system as “MAKE” functions in a data structure mapping the terms to parts of speech. TRANSFORMATIONS are verbs of the form taken by objects, including their properties, and their formation (or changes to their state). A TRANSFORMATION is a function applied to a recipient to change its property from one state of property or intensity to another. The resulting state entails the pre-supposed pre-existing state, since MAKE changes the direction of a property to its opposite value.
A TRANSFORMATION function verb does not have both a donor and recipient which are animated. As such, either a sentence subject or a sentence object may be animated, but not both.
In a TRANSFORMATION clause, if the object O is animated, the subject S is not. For example, the statement “Cancer killed him” maps subject, verb and object as follows: S=‘cancer’, V=‘killed’, O=‘him’. Since ‘cancer’ is inanimate, it becomes an instrument of the transformation verb “killed” such that a mapping of the sentence by the parser (discussed below) may be {MAKE, object O=‘him’, ‘PHYSICAL_STATE’, ‘dead’ is the SPNA, the instrument INST=‘cancer’}.
Conversely, if the subject S is animated, the object O is not, this reflects a passive object and the subject becomes the instrument. Consider the exemplary sentence: “He died from cancer.” If there is an instrument (‘from cancer’), then there is a change of state. Since ‘he’ doesn't control the action, the object slot shifts such that a mapping of the sentence by the parser may be object O=‘he’, ‘PHYSICAL_STATE’, ‘dead’ is the SPNA, and the instrument INST=‘cancer.’ This mapping may take the form {MAKE, O=‘he’, (‘PHYSICAL_STATE’, ‘dead’, SPNA), INST=‘cancer’}.
This is in contrast to a sentence such as “Somebody killed him.” As discussed below, where S and O are animated entities, this is a contract and the function is a TRANSFER function. A mapping of a transfer function may take the form {XTRANSFER, S=‘somebody’, O=‘him’, (‘ATTACK’, SPNA)}. (An XTRANSFER function is a type of TRANSFER function, discussed below.)
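By way of a non-limiting illustration, the animacy rule distinguishing the three example sentences above may be sketched as follows. The `ANIMATE` set, the `VERBS` table, and the dictionary layout of the returned mapping are hypothetical stand-ins for the lexicon and data stores of the system; only the three sentences discussed above are covered.

```python
# Hypothetical mini-lexicon; a real system would consult its dictionary stores.
ANIMATE = {"he", "him", "somebody"}
VERBS = {
    "killed": {"state": ("PHYSICAL_STATE", "dead", "SPNA"), "transfer": ("ATTACK", "SPNA")},
    "died": {"state": ("PHYSICAL_STATE", "dead", "SPNA")},
}


def map_clause(subject, verb, obj=None, instrument=None):
    """Map a clause to a MAKE or XTRANSFER structure using the animacy rule."""
    entry = VERBS[verb]
    subject_animate = subject.lower() in ANIMATE
    object_animate = obj is not None and obj.lower() in ANIMATE
    if subject_animate and object_animate:
        # Two animated entities: a contract, hence a TRANSFER function.
        return {"function": "XTRANSFER", "S": subject, "O": obj, "value": entry["transfer"]}
    if object_animate:
        # Inanimate subject becomes the instrument of the transformation.
        return {"function": "MAKE", "O": obj, "state": entry["state"], "INST": subject}
    # Animated subject that does not control the action shifts to the object slot.
    return {"function": "MAKE", "O": subject, "state": entry["state"], "INST": instrument}
```

For example, `map_clause("cancer", "killed", "him")` and `map_clause("he", "died", instrument="cancer")` both yield MAKE mappings with `INST` set to `"cancer"`, while `map_clause("somebody", "killed", "him")` yields an XTRANSFER mapping.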
Another type of TRANS verb is a TRANS-fer which deals with verbs that transfer an object without direct reciprocity between actors, as in “send” or “receive.” In a TRANSFER function, a contract between two actors is implicit (and required), and breaking the contract is a transgression.
TRANSFER verbs can be separated into abstract transfer (ATRANS), mental transfer (MTRANS), emotional transfer (ETRANS), physical transfer (PTRANS) and conflict transfer (XTRANSFER).
Abstract transfer functions comprise verbs denoting transfer of possession. Example verbs of this type include “get,” “give,” and “donate.” An exemplary semiotic square for “give” would be: SPA—“give/receive,” SPNA—“not give or not receive,” SPB—“deprive,” and SPNB—“spoil.”
For a given term “give” in
Mental transfer functions (MTRANS) comprise those functions denoting communication, or a mental or spiritual transaction. Examples of mental transfer functions include transfers of knowledge: “educate”, “convince”, “influence”. Further, the remaining values of a semiotic square for “advise” or “listen” (as SPA) would include “not suggest” or “not listen”, “free mind”, and “exhortation.”
Physical transfer functions (PTRANS) are verbs denoting a travel or physical movement.
Emotional transfer functions (ETRANS) are verbs denoting a transfer of an emotional state. Examples include “like” and “love”.
The conflict group of transfer verbs (XTRANSFER) is often found in narratives, associated with typical actors: villain, victim, nemesis, ally, traitor, under cover, mediator, and witness. These functions also have typical and predictable prequels and sequels: transgress, attack, ask for/get help, retaliate, pursuit, retribution; and predictable sequences of events: attack, mediate, witness, retaliate, pursuit, retribution. Conflict transactions can include a twist: witness becomes victim, or a party is betrayed.
Another functional type is the TRANS-action. TRANSACTION functions have a donor, a recipient, and an object. The actors or parties involved are generally more important than the object involved, and the object of the transaction is likely to stay unchanged other than the change in the ownership.
Most transactions are expressed by entailment: something that follows logically from or is implied by something else. For example: “If you take a drug, this drug has been administered to you.”
TRANSACTION functions can likewise be broken down into abstract (ATRANSACT) transaction functions, mental (MTRANSACT) transaction functions and conflict (XTRANSACT) transaction functions.
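By way of a non-limiting illustration, the functional ontology described above may be sketched as a two-level lookup from a verb to its function type and from the function type to its family. The enumeration, table names, and helper functions are hypothetical; the verb assignments shown are limited to examples given in this description.

```python
from enum import Enum


class Family(Enum):
    DO = "DO"
    TRANSFORMATION = "MAKE"
    TRANSFER = "TRANSFER"
    TRANSACTION = "TRANSACTION"


# Function type -> family, following the categories named in the description.
FUNCTION_FAMILY = {
    "DO": Family.DO,
    "MAKE": Family.TRANSFORMATION,
    "ATRANS": Family.TRANSFER, "MTRANS": Family.TRANSFER,
    "ETRANS": Family.TRANSFER, "PTRANS": Family.TRANSFER,
    "XTRANSFER": Family.TRANSFER,
    "ATRANSACT": Family.TRANSACTION, "MTRANSACT": Family.TRANSACTION,
    "XTRANSACT": Family.TRANSACTION,
}

# Verb -> function type; a tiny illustrative excerpt of a full dictionary.
VERB_FUNCTION = {
    "run": "DO", "flow": "DO",
    "get": "ATRANS", "give": "ATRANS", "donate": "ATRANS",
    "educate": "MTRANS", "convince": "MTRANS", "influence": "MTRANS",
    "like": "ETRANS",
}


def family_of(verb):
    """Return the functional family of a verb found in the excerpt above."""
    return FUNCTION_FAMILY[VERB_FUNCTION[verb.lower()]]


def is_trans(verb):
    """True for any verb outside the DO family."""
    return family_of(verb) is not Family.DO
```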
Included in the ontology are sentiments, emotions and sensations, collectively referred to herein as “shades.” Shades are generally introduced by adjectives and adverbs.
Adjectives used to express properties, and verbs used to express changes occurring to these properties, are generally equivalent. In transformation verb uses, the nuances of time or accomplishment in the different ways of expressing the change of state are generally irrelevant. The differences between the phrase “I'm angry” and the phrase “This angers me” are insubstantial from a story analysis perspective.
The ontology used in the present technology defines what can be considered to be a set of micro-narratives. As opposed to pairs of opposing words, the semiotic square lexicon includes the basis of a micro-narrative based on a presumption of narrative balance. The narrative balance foundation states that, for almost any story, a “balance” exists between positive and negative elements or actions in the story. Consider for example a simple story where a villain starts out with a good degree of power and ultimately ends up being the subject of justice. Actions of the villain at the beginning of the story lead to repercussions for such actions at the end of the story.
The functional ontology—the classification of the verbs into functions—carries through the programmatic analysis of narrative data by allowing programmatic coupling of relationships in events to expand analysis of narrative data into various different types of outputs.
Each semiotic square may have a similar built in balance. Consider the square of “Flow,” “Obstruct,” “Trickle,” “Pour.” Where there is an obstruction, “flow” is prevented, but one may consider that at some point the obstruction will be removed and the flow restarted. The inherent balance in each semiotic square may be utilized in conjunction with the ontology as the basis for elements of the predictive analysis system herein.
This narrative balance is built into the constructs, described herein, which are utilized in the computations of story analysis.
Returning to
Finding inputs for event analysis is performed by the parser 205. The parser may comprise code adapted to evaluate English grammar patterns for phrases, clauses, and sentences. A sentence pattern is a sequence combining clause patterns, phrase patterns, and parts of speech. The interpreter breaks down sentence patterns to extract and classify parts of speech in incoming narrative data.
An exemplary sentence pattern is “c contrast_connective c” where “c” is a clause and “contrast_connective” is a connector between the clauses. Taking as an example the sentence “This hurts but I don't mind”, the sentence maps to: c: “this hurts”, contrast_connective: “but”, c: “I don't mind”. Note that sentence patterns are discourse patterns using concession, contrast, and the like. These connective patterns are carried along to other portions of the text evaluation to help the parser 205 to articulate the thread in a sentence of data.
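By way of a non-limiting illustration, the “c contrast_connective c” pattern may be sketched as a split of the sentence at the first contrast connective. The `CONTRAST_CONNECTIVES` set and the function name are hypothetical, and whitespace tokenization is a simplifying assumption.

```python
# Hypothetical list of contrast connectives; illustrative only.
CONTRAST_CONNECTIVES = {"but", "yet", "however"}


def split_on_contrast(sentence):
    """Split a sentence into (label, text) pairs around a contrast connective."""
    words = sentence.split()
    for i, word in enumerate(words):
        # The connective must sit between two non-empty clauses.
        if word.lower() in CONTRAST_CONNECTIVES and 0 < i < len(words) - 1:
            return [
                ("c", " ".join(words[:i])),
                ("contrast_connective", word),
                ("c", " ".join(words[i + 1:])),
            ]
    return [("c", sentence)]
```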
A clause pattern breaks down into a noun part (np) and a verb part (vp) (clause pattern=np vp). For example “This sucks” maps to {np: “this”, vp: “sucks”} and “I don't mind” maps to {np: “I”, vp: “don't mind”}. This mapping may be stored in the mapped and classified data store 280.
The clause pattern may be further broken down into a phrase pattern of a pronoun (pn). The phrase pattern may simply be the nouns, where, for example, “this” maps to {pn: “this”} and “I” maps to {pn: “I”}.
Each level of pattern (sentence, clause, phrase) is evaluated using a set of rules describing each possible pattern at that level. Exemplary sentence rules may take the form:
(s, [c]),
(s, [c, break?, contrast_connective, c]),
(s, [contrast_connective, c, break?, c])
....
The above sentence rules show three examples of a set of sentence rules which may run into the thousands. Clause rules take the form:
(c, [vp, pp?]), (c, [aux, np])
.....
Similarly, while only two clause rules are illustrated above, clause rules can run into the thousands for various presentations of clauses available in the English language. In the above clause rule example, the question mark allows the parser to skip optional items. “Soft” breaks may close a clause, although they can be explicitly stated as mandatory or optional.
An exemplary phrase rule takes the form:
(np, [det, adj, n]), (np, [pn])
....
Again, there may be hundreds or thousands of phrase rules defined for each computing environment 200. Sentence rules are stored in a sentence rules data store 220, while clause and phrase rules may be stored in the parser dictionary data store 215. It should be recognized that these databases can be combined.
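By way of a non-limiting illustration, the evaluation of a rule containing optional items may be sketched as a recursive matcher over a sequence of category labels, in which an item ending in a question mark may be skipped. The function names and the list-of-tuples rule encoding are hypothetical; the three sentence rules are those shown above.

```python
SENTENCE_RULES = [
    ("s", ["c"]),
    ("s", ["c", "break?", "contrast_connective", "c"]),
    ("s", ["contrast_connective", "c", "break?", "c"]),
]


def matches(pattern, labels):
    """True if `labels` matches `pattern`; items ending in '?' are optional."""
    if not pattern:
        return not labels
    head, rest = pattern[0], pattern[1:]
    if head.endswith("?"):
        # Either skip the optional item or consume one matching label.
        if matches(rest, labels):
            return True
        return bool(labels) and labels[0] == head[:-1] and matches(rest, labels[1:])
    return bool(labels) and labels[0] == head and matches(rest, labels[1:])


def first_matching_rule(labels, rules=SENTENCE_RULES):
    """Return the index of the first rule whose pattern matches, or None."""
    for index, (_, pattern) in enumerate(rules):
        if matches(pattern, labels):
            return index
    return None
```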
The parser 205 will match the word values to a parser dictionary data store 215 which includes a definition for each word, the definition classifying the word as its potential usage form in sentences. Such lists can be updated dynamically as needed as meanings of words change over time.
The result of the parsing performed at 320 is a set of tokens and parts of speech which are used by interpreter 212 to classify the mapped data in data store 280.
Returning to
A case frame is a mapping of the token and part of speech into one or more context maps. Each case frame is a map of the sentence data to a subject, an object, circumstances (i.e. time, location, instrument), shades (manner and tone), and an excerpt. Each mapped case frame may comprise a case frame record identifying the respective words in a sentence to their constituent portion of the case frame.
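By way of a non-limiting illustration, a case frame record may be sketched as a simple record with one slot per constituent. The class name, field names, and the `record` helper are hypothetical; the function and recipient slots anticipate the case frame contents discussed below.

```python
from dataclasses import asdict, dataclass, field
from typing import Optional


@dataclass
class CaseFrame:
    subject: Optional[str] = None
    function: Optional[str] = None
    obj: Optional[str] = None
    recipient: Optional[str] = None
    circumstances: dict = field(default_factory=dict)  # time, location, instrument
    shades: dict = field(default_factory=dict)         # manner, tone
    excerpt: str = ""

    def record(self):
        """Return only the filled slots, as a plain dictionary."""
        return {slot: value for slot, value in asdict(self).items() if value}


frame = CaseFrame(
    subject="He",
    function="died",
    circumstances={"instrument": "cancer"},
    excerpt="He died from cancer.",
)
```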
The event interpreter loops through steps 410, 420, 425 and 450 to attempt matching of tokens and parts of speech combinations to case frame patterns. At 410, pattern items are demonstrated from left to right in a sentence of text data, following case frame rules and using a syntactic unification of first-order terms. The unified terms of the first successful solution are returned as the dictionary of binding results (unifiers in solution) at 420 and the case frame is selected at 460 and stored at 470.
Additional rules map the interpreter bindings to the case frame at 460, where each case frame may include: SUBJECT (agent), FUNCTION, OBJECT, and RECIPIENT. Additional rules include, for a subject:
Specific rules may be provided to handle multiple clauses, interpreted according to their connective pattern: sequential clauses, prepositional clauses, relative clauses, introducing complementary events into the timeline, and subordinate clauses, flattened when introduced by verbs of “thinking” or “saying”.
At 460, the case frame is selected based on the first order matching performed at 420, and the data is stored in the case frame at 470 (in, for example, data store 280). If no good match exists at 420, the case rules are sorted, and if an intermediate binding exists at 425, the intermediate bindings are stored at 450 in a data structure stack to allow backtracking and optimization of the event interpreter 212. Any remaining list of input tokens is carried across sequential terms as the method loops back from 480 to 410, and mapping continues until all tokens and POS patterns are completed.
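By way of a non-limiting illustration, syntactic unification of first-order terms returning a dictionary of bindings for the first successful rule may be sketched as follows. The convention that variables are strings beginning with "?", the tuple encoding of tokens, the two `CASE_RULES`, and the function names are assumptions for illustration. The sketch omits the occurs check and the stack of intermediate bindings described above, and simply tries each rule in order.

```python
def is_var(term):
    """Variables are strings beginning with '?' (an illustrative convention)."""
    return isinstance(term, str) and term.startswith("?")


def walk(term, bindings):
    """Follow variable bindings until reaching a non-variable or unbound variable."""
    while is_var(term) and term in bindings:
        term = bindings[term]
    return term


def unify(a, b, bindings):
    """Return bindings extended so that a and b are equal, or None on failure."""
    a, b = walk(a, bindings), walk(b, bindings)
    if a == b:
        return bindings
    if is_var(a):
        return {**bindings, a: b}
    if is_var(b):
        return {**bindings, b: a}
    if isinstance(a, tuple) and isinstance(b, tuple) and len(a) == len(b):
        for left, right in zip(a, b):
            bindings = unify(left, right, bindings)
            if bindings is None:
                return None
        return bindings
    return None


# Hypothetical case frame rules: (name, pattern over (part of speech, word) pairs).
CASE_RULES = [
    ("svo", (("pn", "?subject"), ("v", "?function"), ("pn", "?object"))),
    ("sv", (("pn", "?subject"), ("v", "?function"))),
]


def first_solution(tokens, rules=CASE_RULES):
    """Try rules in order; return (rule name, bindings) for the first success."""
    for name, pattern in rules:
        bindings = unify(pattern, tokens, {})
        if bindings is not None:
            return name, bindings
    return None
```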
Again returning to
A method for mapping case frames to events is discussed below with respect to
Event_id
Cast
Actor
Agent
Actor
Agent
Narrative
Function
Circumstances: time, location, instrument, etc.
Shades: manner, tone, etc.
Excerpt
If a match between a case frame and an event frame occurs at 530, then a mapping occurs at 540 and an event record is stored at 550. If no match occurs at 530, then the method continues until a mapping between a case frame and event frame occurs.
Property (PRP) attribution by copula applies to clauses (or case frames) which have a subject, verb, and complement, and where, in general, a complement comprises a property relative to the subject. Hence, the attribution is the description provided by the adjective of the subject. A PRP mapping defines properties acquired by the target through PRP attribution, beneficial or detrimental.
In one embodiment, DO and TRANSFORM functions (i.e. MAKE functions) may be skipped at succeeding levels of interpretation. In other words, detection of a DO or TRANSFORM verb is stored, but not used in a story mapping analysis. Most MAKE functions are descriptive of occurrences in sentence data. However, some MAKE functions may be important to a realization of a TRANSACTION or a TRANSFER. When a MAKE function is unrelated to any TRANSACTION or TRANSFER event, it may also be discarded. However, the technology categorizes and stores DO and TRANSFORM functions as they may be utilized in other analysis embodiments of the technology.
TRANSFER functions are most frequently found in a story narrative and imply asymmetrical transfer, such as abuse, mischief, and deceit. For TRANSFER functions, the cast of actors involved is mapped to fit the semiotic square value of the TRANSFER function. For example, the agent of a detrimental transfer like “abuse” is one of a “villain” or “trickster”. The recipient of a detrimental transfer like “abuse” is the “victim.”
TRANSACTION functions define a symmetrical event, typically found in the climax event. Examples of a symmetrical event denote symmetry between the actors, such as a conspiracy or alliance. The cast of actors involved is adjusted to fit the semiotic square value of the function. In addition, co-agents of a symmetrical transaction like “conspiracy” may be included.
Once events are mapped and stored in event frames, the result of step 360 is a plurality of event records for each item or set of narrative data. The event records may be manipulated in a variety of ways. In one embodiment, events are mapped to a narrative context by the story interpreter, and can be manipulated to produce one of several types of story analyses as indicated at 370 in
Timeline
[Event_id, Event_id ...]
Narrative
Name
List of eventful functions
Cast
List of eventful actors
When performing a story analysis, from event to story, a story interpreter 214 may operate to reorder a textual sequence into a set of threads inside the same narrative. The story rules 240 provide heuristics to identify and reorder the threads. An exemplary story rule is defined as follows: [(“villain punished”, [(villain, abuse, victim), (victim, ask for help), (hero, contract, victim), (hero, punish, villain)])]
The system may include thousands of story rules based on implicit story patterns defined for the story interpreter 214. As noted above, each story comprises a narrative and a cast. Each narrative includes a pattern of events (“love story”, “natural disaster”) and has a narrative structure: sequences, parallel sequences, and cycles. As noted above, each event in a story comprises a function, actors, circumstances, and shades. TRANS functions serve as the basis for story analysis. Actors may be characterized in terms of their role within the story, such as, for example, a protagonist, an ally, a villain, a witness, a victim, or the like. Circumstances define, for example, time, location, and settings. Time is important to events as it is used to generate a timeline of events in one of the analysis outputs discussed below. Shades add a descriptive nature to each of the events. Stories may also include sub-stories, or sub-plots, collectively referred to as episodes. Each episode comprises a sub-pattern of events inside the main narrative. Some subplots are classified as disruptive, where, for example, they may comprise a “red herring” to the story.
At step 650, the story interpreter matches cast and events to story patterns. A story pattern is a sequence of events associated to a thematic name.
Some events and roles will be generated as probable by entailment rules. For example, the existence of a cadaver implies that there is a killing and some actor is a killer. In creating story timelines, the “protagonist” that is the focus of each thread in a story can be used to trace and to chain events across a timeline, as illustrated in
If the story patterns match, then the event ID fields are matched to the story pattern fields, at 660. If not, the method continues at 650 until the story pattern is matched.
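As a non-limiting sketch of the matching at 650 and 660, the “villain punished” story rule might be matched against a list of event records as shown below. The event tuple layout (event id, agent, function, recipient), the actor names, and the functions `bind` and `match_story` are hypothetical and chosen only for this example. The sketch assumes each actor plays a single role and searches, with simple backtracking, for the pattern steps in order among the events.

```python
# The exemplary story rule, written as (agent role, function, recipient role).
STORY_RULE = ("villain punished", [
    ("villain", "abuse", "victim"),
    ("victim", "ask for help", None),
    ("hero", "contract", "victim"),
    ("hero", "punish", "villain"),
])


def bind(roles, role, actor):
    """Extend the role -> actor bindings consistently; return None on conflict."""
    if role is None:
        return roles
    if actor is None:
        return None
    if role in roles:
        return roles if roles[role] == actor else None
    if actor in roles.values():
        return None  # simplification: one actor plays one role
    extended = dict(roles)
    extended[role] = actor
    return extended


def match_story(events, steps, roles=None, start=0):
    """Find the pattern steps, in order, among the events.

    Returns (role bindings, matched event ids) or None if the pattern is absent.
    """
    roles = {} if roles is None else roles
    if not steps:
        return roles, []
    agent_role, function, recipient_role = steps[0]
    for i in range(start, len(events)):
        event_id, agent, fn, recipient = events[i]
        if fn != function:
            continue
        bound = bind(roles, agent_role, agent)
        if bound is not None:
            bound = bind(bound, recipient_role, recipient)
        if bound is None:
            continue
        rest = match_story(events, steps[1:], bound, i + 1)
        if rest is not None:
            final_roles, ids = rest
            return final_roles, [event_id] + ids
    return None
```

When a match is found, the returned event ids correspond to the event ID fields that would be mapped to the story pattern fields, and the role bindings give the cast.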
The result of the foregoing process is to provide a sequence of story events which in one embodiment may comprise a machine generated story, created from multiple sets of narrative data taken as input. The method may be repeated for any new set of narrative data, thereby allowing dynamic updating of a machine generated story for each new set of narrative data taken as input to the method.
As noted above, each set of events can be utilized to create different types of outputs.
In another unique aspect of the technology, story rules may be predicated on narrative balance, and in particular narrative balance with respect to each actor in a story.
Two forms of balance found in any narrative are transgression and retribution. If there is a transgression, there will also be a retribution. Between the two negative forms of prohibition and violation, the second one is clearly the most decisive. Any contract will be violated, and any rule will be transgressed. The only counterweight to transgression will be the result of thymic functions, and/or adjustments of degrees of axiological opposition between moral standards.
The semiotic square of functions helps to expand the analysis of the function by exploring opposite, reciprocal and gradual relationships. Typically a retribution can then be chained with the start of a negative transaction: if A attacks B, then the reciprocal expected event is a retribution by B towards A. (
Each actor triggers his plan according to his own point of view. In a wrongdoing/retribution pattern, the act object of retribution has only to be considered to be an incitement to a new wrongdoing to obtain a narrative cycle. The two associated sequences then correspond to two complementary points of view of the same action: i.e. redress for the aggressor and wrongdoing for the victim, alternately attributed to the two protagonists, with the wrongdoing becoming a damage to be redressed.
The dynamics of the archetypal narrative works around a double inversion. This canonical sequence chains three phases: opening, core, and closing. The opening of a sequence involves preparation of a resource which is the object that will receive the sequence core, or the object of the sender controlling the sequence. The core of the sequence involves the execution of events and transformations that justify the overall significance given to the sequence (potentially reflected in the name of the sequence). The closing of the sequence, which may be actual or virtual, restores the initial state of balance, freezes the resources used in the opening of the sequence, and replaces protagonists in the environment.
An example of an axiological ontology adding another layer of complexity to the contractual square is the articulation of secret and illusion, stemming from a difference of degree between two levels of (recognition of) reality. Access to the ontology of this technology allows one to distort the terms of the contract: keep the terms secret, lie about the terms, give false commitments, etc. These distortions will be repaid later in the narrative, unmasking the real intents of the protagonists and leading to subsequent plot twists. Thymic and axiological overlays on contractual transactions carry aspectual (occurrence, duration, frequency) and intensity values, both varying over time and circumstances. Again, these inflections will generate emotional shifts and lead to new developments. An axiological ontology can be reflected in the semiotic square such as that illustrated in
At 380, based on the events mapping, functional mapping, and narrative mapping, an analysis of the story may be created and directed to output 260. In the above, any of a number of different types of outputs 260 can consume the analysis of the story. As noted above, analysis engine 250 may provide a number of analyses to any number of outputs. The analyses include, but are not limited to: generating a timeline of events for display and/or consumption by a story-based social network; generating a prediction of a next event in a timeline; generating a believability score; and generating a comparison of a story to other stories.
As noted above, each record of an event includes an EVENTID and circumstances including time and location. Events have been mapped to story rules, allowing classification of the stories as a series of events. At 710, for each event in a story relative to a story mapping, the event is ordered relative to time in an ordered list. As noted above, protagonists can be utilized to map story threads through the story timeline. At 720, a selection of two or more events may be made. Because stories may have a significant number of events, a timeline may be programmatically limited to a specified number of events, the importance of which can be calculated based on the relative importance of the function, the actor and the objects involved. At 750, a determination is made as to whether any one event is related to a pre-existing timeline or whether the event is related to a new timeline. If the event is not related to a pre-existing timeline, then the event is added to a new timeline at 795. If the sequence can be related to a particular pre-existing story, then a determination is made at 760 as to whether or not the new event supersedes the previous event. If the event does not supersede a previous event, then the event is added to the timeline at 790. If the event does supersede the previous event, then at 780 the previous event is removed from the previous timeline, and the new event is added to the previous timeline at 790. If neither 750 nor 760 is true, the event is added to the timeline and the next event is retrieved at 710.
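The timeline method of steps 710 through 795 might, under some simplifying assumptions, be sketched as follows. The `Event` fields, the use of the protagonist as the key that relates an event to a pre-existing timeline, and the explicit `supersedes` field are assumptions made for this illustration; the description leaves the exact tests at 750 and 760 open.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Event:
    event_id: int
    time: int
    protagonist: str
    function: str
    importance: float = 1.0
    supersedes: Optional[int] = None  # id of an earlier event this one replaces


def build_timelines(events, max_events=None):
    """Group time-ordered events into one timeline per protagonist."""
    ordered = sorted(events, key=lambda e: e.time)  # step 710: order by time
    if max_events is not None:  # step 720: keep only the most important events
        top = sorted(ordered, key=lambda e: e.importance, reverse=True)[:max_events]
        keep_ids = {e.event_id for e in top}
        ordered = [e for e in ordered if e.event_id in keep_ids]

    timelines = {}
    for event in ordered:
        timeline = timelines.get(event.protagonist)  # step 750
        if timeline is None:
            timelines[event.protagonist] = [event]  # step 795: new timeline
            continue
        if event.supersedes is not None:  # step 760
            # step 780: drop the superseded event from the existing timeline
            timeline[:] = [e for e in timeline if e.event_id != event.supersedes]
        timeline.append(event)  # step 790
    return timelines
```

Keying timelines on the protagonist follows the observation above that protagonists can be used to trace story threads; other relatedness tests could be substituted at 750.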
Another type of analysis that may be performed at 370 is story event prediction.
A first predictive method includes analysis of event sequences using a statistical method, such as a Markov chain of events. Within a narrative context, the Markov chain analyzes a sequence of random variables (in this case events) with a property defining a serial dependence only between adjacent portions of the chain. As stories generally follow a chain of linked events, Markov analysis can be applied at the event level (using the semiotic square of elements in the event frame) to determine a predictable next event in a sequence, in this case a story.
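A minimal sketch of the first predictive method, assuming a first-order Markov chain estimated from previously observed sequences of event functions, is given below. The function names `train_transitions` and `predict_next`, and the representation of each event by its function label alone, are assumptions for this example; an implementation could equally condition on other elements of the event frame.

```python
from collections import Counter, defaultdict


def train_transitions(sequences):
    """Count how often each event function is followed by each other function."""
    counts = defaultdict(Counter)
    for seq in sequences:
        for current, following in zip(seq, seq[1:]):
            counts[current][following] += 1
    return counts


def predict_next(counts, current):
    """Return (most likely next function, estimated probability), or None."""
    followers = counts.get(current)
    if not followers:
        return None
    total = sum(followers.values())
    function, n = followers.most_common(1)[0]
    return function, n / total
```

The estimated probability returned alongside the prediction is simply the relative frequency of the transition in the training sequences.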
A second prediction method is based on narrative balance. The constant narrative balance has been discussed above. For any story, and for any protagonist (in this case the subject of the prediction), a narrative balance analysis evaluates the number of transgressions and retaliations in a given story chain at a given point in the story. Transgressions and retaliations can be determined by, for example, mapping for each TRANS verb an indication of whether the verb identifies an action associated with a transgression or retaliation and identifying event mappings which indicate transgression and retaliation events on a per protagonist basis. Next one determines whether or not the number of transgression events is greater than the number of retaliation events for a given protagonist. If the number of transgression events is greater than the number of retaliation events for a given protagonist, the protagonist is likely the subject of a retaliation event.
This characterization is performed for each protagonist such that there are no “villains” and “heroes”—there are only actors. If one protagonist/actor, on balance, has more transgressions than retaliations relative to another actor, one can predict what the next action relative to each actor will be based on narrative balance. The actor with more transgressions is likely to retaliate against the actor who has committed the transgressions against him/her.
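The narrative balance prediction might be sketched as shown below, assuming each event has already been labeled as a transgression or a retaliation. The tuple layout (agent, kind, recipient), the labels, and the function name `predict_retaliations` are hypothetical. For each ordered pair of actors the sketch keeps a running balance of transgressions not yet answered by a retaliation, and predicts a retaliation wherever that balance is positive.

```python
from collections import Counter


def predict_retaliations(events):
    """Predict pending retaliations from labeled events.

    events: iterable of (agent, kind, recipient), with kind being either
    "transgression" or "retaliation".
    Returns a list of (predicted retaliator, target, outstanding count).
    """
    balance = Counter()
    for agent, kind, recipient in events:
        if kind == "transgression":
            balance[(agent, recipient)] += 1  # agent wronged recipient
        elif kind == "retaliation":
            balance[(recipient, agent)] -= 1  # agent answered recipient's wrong
    return [
        (victim, offender, outstanding)
        for (offender, victim), outstanding in sorted(balance.items())
        if outstanding > 0
    ]
```

Because the balance is kept per pair of actors, no actor is labeled a “villain” or a “hero” in advance; the prediction depends only on the counts.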
Narrative balance prediction can be further enhanced by seeking shift precursors: protagonists with a negative prequel will look forward to having a positive sequel. This extends the narrative balance prediction methodology to overall story analysis.
Another method for prediction of a next event is to examine protagonist behavioral patterns, learned from similar situations. For any given protagonist in a story, all events involving a given protagonist can be evaluated and examined for semiotic similarities. If an event in a story indicates that an actor will kill when being chased by another actor, a similarity analysis will equate a likelihood of a next action to predict that the same actor will likely kill when being pursued by a different actor, or even that the act of killing can be extended to other elements of the semiotic square defined for the element of “chase.”
Returning
At 850, a determination is made as to whether or not another method should be utilized to create a prediction. In one embodiment, step 850 may be skipped and only one predictive method will be utilized. In another embodiment, all methods can be utilized. The determination at 850 may be a programmatic selection configured into the computing environment 200. Once one or more predictions have been made, a prediction is selected at 860.
In another embodiment, each of a number of predictive methods may be performed to provide at least as many predictive events as available methods and may further include performing semiotic substitution on one or more events in the story to produce additional predictive events based on each of the aforementioned methods. In a further embodiment, all such predictions in the form of predictive events are returned to the output.
In the example of
Clustering generally groups a set of objects in such a way that objects in the same group are more similar to each other than to those in other groups. Clustering may be performed by any number of different algorithms. In the present technology, connectivity based clustering may be utilized to determine relationships between events based on a distance measure between the events when organized into a cluster. Any number of different algorithms connect events to form “clusters” based on their distance. At different distances, different clusters will form, which can be represented as a hierarchy of clusters that merge with each other at certain distances.
Connectivity based clustering—based on distance—can be used to generate, for any event, a believability score. The believability score can be created based on the relative distance between two events in a cluster, with one of the events constituting a prototype (cluster center) from which the believability score is measured. The believability score can be added to any predictive analysis as part of an output (timeline or story) to reflect a level of confidence (or believability) in the projection.
For predictability analysis, clustering distances can be used after filtering for threshold distances. Event clusters can be defined as areas of higher density than the remainder of the data set. Objects in these sparse areas—that are required to separate clusters—are usually considered to be noise and border points.
Thus, only events at points within certain distance thresholds or densities may be considered where such events satisfy a density criterion defined as a minimum number of other events within a given cluster radius.
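For illustration, the density criterion and a distance-based believability score might be sketched as follows. The description does not specify how events are embedded or how distance is converted into a score, so the representation of events as numeric tuples, the Euclidean distance, and the formula `1 / (1 + distance)` are all assumptions made for this sketch.

```python
import math


def euclidean(a, b):
    """Distance between two events represented as numeric tuples (an assumption)."""
    return math.dist(a, b)


def dense_events(points, radius, min_neighbors, distance=euclidean):
    """Keep events that have at least min_neighbors other events within radius."""
    kept = []
    for i, p in enumerate(points):
        neighbors = sum(
            1 for j, q in enumerate(points)
            if j != i and distance(p, q) <= radius
        )
        if neighbors >= min_neighbors:
            kept.append(p)
    return kept


def believability(event, prototype, distance=euclidean):
    """Map distance to the cluster prototype into a score in (0, 1].

    The 1 / (1 + d) mapping is a hypothetical choice, not taken from the source.
    """
    return 1.0 / (1.0 + distance(event, prototype))
```

Under this mapping, an event that coincides with the prototype scores 1.0, and the score decreases as the event moves away from the cluster center.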
In
At 910, predicting story similarity may be based on abstracting the story structure based on the aforementioned story rules, and for each element in the story rule, determining a taxonomical distance between that element and another element in a respective taxonomy. Two stories may be considered similar if their narratives and/or cast are similar. Narrative similarity may be a combination of: distance in a narratives taxonomy and optimal matching between events in narratives. Cast similarity is a function of a distance in an actors taxonomy.
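One simple way to realize a taxonomical distance, offered only as a sketch, is to count the edges on the path between two nodes of a tree-shaped taxonomy. The taxonomy below, stored as a child-to-parent mapping, is entirely hypothetical, as are the names `PARENT`, `ancestors` and `taxonomic_distance`.

```python
# Hypothetical narratives taxonomy, stored as child -> parent.
PARENT = {
    "conflict": "narrative",
    "romance": "narrative",
    "revenge": "conflict",
    "war": "conflict",
    "love story": "romance",
}


def ancestors(node, parent):
    """Return the path from node up to the root, starting with node itself."""
    path = [node]
    while node in parent:
        node = parent[node]
        path.append(node)
    return path


def taxonomic_distance(a, b, parent):
    """Number of edges between a and b through their lowest common ancestor."""
    depth_b = {n: d for d, n in enumerate(ancestors(b, parent))}
    for depth_a, n in enumerate(ancestors(a, parent)):
        if n in depth_b:
            return depth_a + depth_b[n]
    return None  # the two nodes share no common ancestor
```

With this mapping, two genres under the same parent are at distance 2, while genres in different branches are further apart.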
A number of different taxonomies may be provided to classify different types of topics in a story.
To compute similarities between a given narrative and other narratives, in a first embodiment, a linear distance is computed between the respective elements in a narrative and the corresponding elements in the narrative being compared thereto. In one embodiment, this may be the linear calculation of a distance between two elements in a given taxonomy. For example, if a story has been classified as a destiny genre (illustrated in
A second methodology, which may be used alone or in combination with the linear distance calculation, is that of an optimal matching at 920. Optimal matching attempts to provide, for each of the elements in the narrative, a determination of the relationship to a corresponding element in another narrative. Optimal matching analysis (OMA) is a sequence analysis method to assess the dissimilarity of ordered arrays of tokens (herein, events) that usually represent a time-ordered sequence of states. Once such distances have been calculated for a set of observations (e.g. individuals in a cohort), classical tools (such as cluster analysis) can be used. OMA is a family of procedures that takes into account the full complexity of sequence data. The objective is to identify a typology of sequences empirically. The core of OMA is a two-step procedure. First, given a set of sequences, find the distance between every pair of sequences through an iterative minimization procedure. This will give a distance matrix for all the sequences. Second, apply clustering or cognate procedures to the distance matrix to ascertain if the sequences fall into distinct types. The objective of the first step is to find, for each pair of sequences in the sample, the lowest “cost” needed to turn one sequence into another, using three elementary operations: insertion, deletion, and substitution, such as substitution using the semiotic square defined herein.
Utilizing the ontology discussed herein, as well as the semiotic square of each element in a narrative, one can compute the number of similar elements, narrative by narrative, and determine an optimal matching value for each event. Optimal matching between two different narratives can produce an optimal matching value score, and the score may be compared with the taxonomical distance value, or combined with the taxonomical distance value, to achieve a similarity score between two different narratives. Multiple narratives can be compared and ranked based on this similarity score. In another aspect, if a user promotes a particular story as being similar to another story, this can be factored into the similarity index.
At various points in the aforementioned disclosure, the use of semiotic square substitution has been described.
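The optimal matching distance and its combination with a taxonomical distance might be sketched as follows. The sketch uses the standard dynamic-programming edit distance over insertion, deletion and substitution. The cost values, the `SEMIOTIC_GROUPS` table standing in for semiotic square relations, the equal weighting, and the normalization used in `similarity` are assumptions for illustration and are not taken from the description.

```python
# Hypothetical groups of functions related through a semiotic square.
SEMIOTIC_GROUPS = [{"abuse", "deceive"}, {"punish", "pardon"}]


def semiotic_cost(x, y):
    """Substitution cost: free if equal, cheaper if semiotically related."""
    if x == y:
        return 0.0
    if any(x in group and y in group for group in SEMIOTIC_GROUPS):
        return 0.5
    return 1.0


def optimal_matching(seq_a, seq_b, sub_cost=semiotic_cost, indel=1.0):
    """Lowest total cost of turning seq_a into seq_b by insert/delete/substitute."""
    rows, cols = len(seq_a) + 1, len(seq_b) + 1
    d = [[0.0] * cols for _ in range(rows)]
    for i in range(1, rows):
        d[i][0] = i * indel
    for j in range(1, cols):
        d[0][j] = j * indel
    for i in range(1, rows):
        for j in range(1, cols):
            d[i][j] = min(
                d[i - 1][j] + indel,  # deletion
                d[i][j - 1] + indel,  # insertion
                d[i - 1][j - 1] + sub_cost(seq_a[i - 1], seq_b[j - 1]),
            )
    return d[-1][-1]


def similarity(seq_a, seq_b, tax_distance, weight=0.5):
    """Combine optimal matching and taxonomical distance into a score in [0, 1]."""
    longest = max(len(seq_a), len(seq_b), 1)
    om_similarity = 1.0 - optimal_matching(seq_a, seq_b) / longest
    tax_similarity = 1.0 / (1.0 + tax_distance)
    return weight * om_similarity + (1.0 - weight) * tax_similarity
```

Computing `similarity` for a query narrative against each stored narrative and sorting the results would give one possible ranking of the kind described above.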
Semiotic square substitution at 930 for each of a number of moments in a narrative can be utilized at multiple levels in the context of the present technology.
Any one or more of the above methods 910, 920, 930 may be used and a determination made at 940 to select the similarity result of the calculation.
Social network computing environment 2410 includes a friend database 2420, application server 2435 and one or more analysis engines 2415. Each analysis engine includes a story processor creating a curated story in accordance with the description herein. The story processor takes as input sets of narrative data 2405 which comprise text inputs which are provided to the story generator. In addition, other data sources 2432 can provide data to the analysis engine and story processor. Other data sources may include, for example, social networks such as Twitter and Facebook, weblogs or “blogs”, commercial news sources such as CNN and MSNBC, or any electronically available narrative data source which may be accessed directly or indirectly using programmatic means. In a further embodiment, the sets of narrative data 2405 may be user-selected. In one embodiment, an interface presenting a set of narrative data encompassing a variety of different input stories (such as news articles) is provided to a user of social network computing environment 2410 via user device 2450. The computing environment 2410 receives a user-selection of one or more input stories as input sets of narrative data 2405, and the analysis is performed based on the user-selected set of input data.
One or more client devices 2425 may be coupled to the social network computing environment 2410 via one or more public or private networks, such as the Internet. Likewise, a user device 2450 may also be coupled to the social network computing environment 2410 via one or more public and private networks. Each user device 2450 and client device 2425 may be a separate processing device, or may be considered an application executing on a processing device which provides access to the output of the social network computing environment 2410. An example of such an application would be a web browser, where application server 2435 outputs a webpage to the client device 2425 or user device 2450.
As illustrated in
As illustrated on user device 2450, a user device may include user modifications, an event timeline, and friend input is further illustrated in
Moreover, the computing system 3702 includes a RAM 3720 and a non-volatile storage 3730 that can communicate with each other, and with processor 3710, via a bus 3708. Illustrated in the non-volatile storage 3730 are components including a parser, analysis engine, event interpreter and story interpreter as discussed herein.
As shown, the computing system 3702 may further include a display unit 3750, such as a liquid crystal display (LCD), an organic light emitting diode (OLED), a flat panel display, a solid state display, or a cathode ray tube (CRT). Additionally, the computing system 3702 may include an input device 3760, such as a keyboard/virtual keyboard or touch-sensitive input screen or speech input with speech recognition, and which may include a cursor control device, such as a mouse or touch-sensitive input screen or pad. A network interface 3740 may be a wired or wireless network interface allowing the system 3702 to communicate with other devices in the manner discussed herein, via public, private, wireless and wired networks.
In the embodiment illustrated in
Memories described herein are tangible storage mediums that can store data and executable instructions, and are non-transitory during the time instructions are stored therein. A memory described herein is an article of manufacture and/or machine component. Memories described herein are computer-readable mediums from which data and executable instructions can be read by a computer. Memories as described herein may be random access memory (RAM), read only memory (ROM), flash memory, electrically programmable read only memory (EPROM), electrically erasable programmable read-only memory (EEPROM), registers, a hard disk, a removable disk, tape, compact disk read only memory (CD-ROM), digital versatile disk (DVD), floppy disk, Blu-ray disk, or any other form of storage medium known in the art. Memories may be volatile or non-volatile, secure and/or encrypted, unsecure and/or unencrypted.
Aspects of the present disclosure are described herein with reference to flowchart illustrations and/or block diagrams of methods, apparatuses (systems) and computer program products according to embodiments of the disclosure. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable instruction execution apparatus, create a mechanism for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer readable medium that when executed can direct a computer, other programmable data processing apparatus, or other devices to function in a particular manner, such that the instructions when stored in the computer readable medium produce an article of manufacture including instructions which when executed, cause a computer to implement the function/act specified in the flowchart and/or block diagram block or blocks. The computer program instructions may also be loaded onto a computer, other programmable instruction execution apparatus, or other devices to cause a series of operational steps to be performed on the computer, other programmable apparatuses or other devices to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide processes for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks.
Although the subject matter has been described in language specific to structural features and/or methodological acts, it is to be understood that the subject matter defined in the appended claims is not necessarily limited to the specific features or acts described above. Rather, the specific features and acts described above are disclosed as example forms of implementing the claims.
Executed on | Assignor | Assignee | Conveyance | Frame | Reel | Doc |
Mar 23 2016 | DECKER, SUSAN | TRIPLEDIP, LLC | ASSIGNMENT OF ASSIGNORS INTEREST SEE DOCUMENT FOR DETAILS | 038589 | /0939 | |
Mar 24 2016 | VOGEL, CLAUDE | TRIPLEDIP, LLC | ASSIGNMENT OF ASSIGNORS INTEREST SEE DOCUMENT FOR DETAILS | 038589 | /0939 | |
Mar 25 2016 | RAFTR, INC. | (assignment on the face of the patent) | / | |||
Mar 23 2018 | TRIPLEDIP, LLC | RAFTR, INC | ASSIGNMENT OF ASSIGNORS INTEREST SEE DOCUMENT FOR DETAILS | 045353 | /0452 |
Date | Maintenance Fee Events |
Jun 26 2023 | REM: Maintenance Fee Reminder Mailed. |
Dec 11 2023 | EXP: Patent Expired for Failure to Pay Maintenance Fees. |
Date | Maintenance Schedule |
Nov 05 2022 | 4 years fee payment window open |
May 05 2023 | 6 months grace period start (w surcharge) |
Nov 05 2023 | patent expiry (for year 4) |
Nov 05 2025 | 2 years to revive unintentionally abandoned end. (for year 4) |
Nov 05 2026 | 8 years fee payment window open |
May 05 2027 | 6 months grace period start (w surcharge) |
Nov 05 2027 | patent expiry (for year 8) |
Nov 05 2029 | 2 years to revive unintentionally abandoned end. (for year 8) |
Nov 05 2030 | 12 years fee payment window open |
May 05 2031 | 6 months grace period start (w surcharge) |
Nov 05 2031 | patent expiry (for year 12) |
Nov 05 2033 | 2 years to revive unintentionally abandoned end. (for year 12) |