Good metadata offers a variety of benefits, opening up new possibilities for connecting with readers and managing content.
Get a quick introduction to the AP Metadata Services (PDF).
Solutions and Use Cases
Find out how you can take advantage of AP metadata to enrich your news content (PDF).
Review the complete documentation of API methods (PDF).
AP News Taxonomy includes standardized subjects, planned and breaking news events, people, organizations, geographic locations and more, all designed with news content in mind. Frequent updates ensure timeliness, and a rich network of semantic relationships between concepts enables creative solutions for content linking, search and discovery.
Integrate the AP News Taxonomy into your publishing systems to support manual tagging, or use it in conjunction with the AP Tagging service to apply the taxonomy to your content.
In addition to standardized terminology and unique IDs, the taxonomy stores a variety of details about the people, places, and topics it contains.
This information can power enhanced search experiences, browsing and discovery, or informational displays.
There are five main areas of coverage:
A wide variety of hierarchically structured topics ranging from broad categories (Crime) to specific concepts (Illegal firearms). Also includes many named events such as Academy Awards and Tour de France. More than 4,200 terms in all.
Over 2,500 geographic place names arranged hierarchically – continents, world regions, countries, territories, national capitals, major world cities, US states, Canadian provinces, and a large number of US cities and towns.
Organizations and institutions from a wide variety of sectors: government organizations, non-profits, sports teams, colleges and universities, political and ideological groups, cultural institutions, and more. Over 2,500 different terms.
Celebrities, artists, designers, authors, business leaders, political figures, sports figures, royalty, and other newsmakers known at the global or US national level. Coverage is especially broad for US newsmakers in politics, entertainment and sports, including complete rosters for major professional sports teams, men’s NCAA Division I basketball and football players, all US officeholders at the federal and gubernatorial levels, and all candidates for those offices. More than 160,000 individuals covered.
Over 70,000 publicly traded companies – including all companies with primary shares trading on any of 70 major global stock exchanges, or trading as ADRs on an American exchange.
The AP News Taxonomy is constantly being updated to capture the latest news and the biggest newsmakers. Whether it’s this week’s IPOs or the new crop of college athletes, AP’s taxonomy developers are always working to keep the vocabularies current and relevant.
Subscribers are kept up-to-date in real time – as soon as a change is published, the new version becomes available in the AP News Taxonomy. Numerical versioning keeps the changes organized and in sync with the data provided by the AP Tagging Service. A detailed log of all changes is accessible through a separate API. You can keep track of all changes, or just the ones you care most about.
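One way to consume such a change log is to remember the last version you processed, then filter newer entries down to the vocabularies you care about. The sketch below shows only that filtering step. The record fields (version, vocabulary, term, action), the vocabulary labels, and the sample entries are all hypothetical; the actual change-log format is defined by the API.

```python
from dataclasses import dataclass


@dataclass
class Change:
    version: int      # hypothetical numeric taxonomy version
    vocabulary: str   # assumed labels such as "Company" or "Person"
    term: str
    action: str       # assumed values such as "added" or "deprecated"


def changes_since(log, last_seen_version, vocabularies=None):
    """Return changes newer than last_seen_version, optionally filtered by vocabulary."""
    return [
        c for c in log
        if c.version > last_seen_version
        and (vocabularies is None or c.vocabulary in vocabularies)
    ]


# Made-up sample entries, for illustration only.
log = [
    Change(101, "Company", "Example Corp", "added"),
    Change(102, "Person", "Jane Example", "added"),
    Change(103, "Company", "Old Example Inc", "deprecated"),
]

for change in changes_since(log, last_seen_version=101, vocabularies={"Company"}):
    print(change)
```

Passing vocabularies=None returns every change after the given version, which corresponds to tracking all changes rather than a subset.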
Subscribers access the taxonomy through API calls that can request the full set of terms in a given vocabulary, a subset of terms, or information about a particular term. Calls are also available for retrieving deprecated terms, term change logs, and additional information about the structure of the taxonomy. Taxonomy data can be returned in Semantic-Web compatible formats, such as RDF/XML or RDF/TTL. Information about a particular term can also be returned as HTML.
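As a rough illustration of working with taxonomy data returned as RDF/XML, the sketch below parses a small hand-written sample using Python's standard library. The sample document, the example.org URIs, and the use of SKOS properties (skos:prefLabel, skos:broader) are assumptions made for illustration; the Developer's Guide describes the actual API calls and response schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical RDF/XML payload; the real response schema may differ.
SAMPLE_RDF_XML = """
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:skos="http://www.w3.org/2004/02/skos/core#">
  <skos:Concept rdf:about="http://example.org/term/health">
    <skos:prefLabel xml:lang="en">Health</skos:prefLabel>
  </skos:Concept>
  <skos:Concept rdf:about="http://example.org/term/food-safety">
    <skos:prefLabel xml:lang="en">Food safety</skos:prefLabel>
    <skos:broader rdf:resource="http://example.org/term/health"/>
  </skos:Concept>
</rdf:RDF>
"""

# Namespace prefixes in ElementTree's {uri}localname notation.
RDF = "{http://www.w3.org/1999/02/22-rdf-syntax-ns#}"
SKOS = "{http://www.w3.org/2004/02/skos/core#}"


def parse_terms(rdf_xml: str) -> dict:
    """Map each term URI to its label and the URIs of its broader terms."""
    root = ET.fromstring(rdf_xml.strip())
    terms = {}
    for concept in root.iter(SKOS + "Concept"):
        uri = concept.get(RDF + "about")
        label = concept.findtext(SKOS + "prefLabel")
        broader = [b.get(RDF + "resource") for b in concept.findall(SKOS + "broader")]
        terms[uri] = {"label": label, "broader": broader}
    return terms


terms = parse_terms(SAMPLE_RDF_XML)
for uri, info in terms.items():
    print(uri, info["label"], info["broader"])
```

A dedicated RDF library would be a more robust choice for production use, since RDF/XML allows several equivalent serializations that this element-based approach does not handle.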
To learn more, refer to the Developer's Guide.
The AP Tagging service receives your English-language news content and automatically returns relevant metadata, using standardized terminology from the AP News Taxonomy.
This smart service goes well beyond mere text extraction; it uses human-created semantic rules to understand your content and identify the most pertinent concepts and topics.
Human-managed rules allow for more precise control over the service performance.
Although the system will recognize and return specific entities that it finds in the submitted content (aka "text extraction"), it also uses human-created semantic rules to identify topics that may not be explicitly mentioned in the text at all.
For example, a story about a particular country music star can trigger the "Country music" subject, even if the word "music" does not appear in the story.
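The idea of a rule firing on a recognized entity rather than on a literal keyword can be sketched roughly as follows. This is a toy illustration, not the AP Tagging service's implementation: the rule table, the performer name, and the tag_text function are all hypothetical.

```python
# Hypothetical rules: a recognized entity name implies one or more subjects.
ENTITY_RULES = {
    "Jane Example": ["Country music"],  # made-up performer name
}


def tag_text(text: str) -> dict:
    """Return entities found in the text and the subjects their rules imply."""
    entities = [name for name in ENTITY_RULES if name in text]
    subjects = []
    for name in entities:
        for subject in ENTITY_RULES[name]:
            if subject not in subjects:  # avoid duplicate subjects
                subjects.append(subject)
    return {"entities": entities, "subjects": subjects}


# The story never uses the word "music", yet the subject is still inferred.
story = "Jane Example announced a new tour on Tuesday."
result = tag_text(story)
print(result)
```

Real semantic rules are presumably far richer than a name lookup, but the sketch shows why a subject can be returned even when its label never appears in the text.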
Drawing from the AP News Taxonomy, the tagging service looks at each piece of submitted content and returns standardized names, IDs and other properties for all the relevant taxonomy terms that triggered a semantic rule. You can specify which of the five areas of coverage the service should return.
After all the matching metadata values have been identified, the service checks for additional standardized names and IDs based on relationships stored in the AP News Taxonomy. For example, the subject hierarchy ensures that any item tagged with "Food safety" is also tagged with "Health", and any content that picks up a sports league is tagged with the relevant sport subject.
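This relationship-based step amounts to walking up the hierarchy from each matched term. The following is a minimal sketch, assuming a simple child-to-parent mapping. The BROADER table and the expand_tags function are hypothetical, and only the "Food safety" to "Health" link is taken from the description above.

```python
# Hypothetical child -> parent links; a real taxonomy would supply these.
BROADER = {
    "Food safety": "Health",
}


def expand_tags(tags: list) -> list:
    """Return the tags plus all of their ancestors, preserving first-seen order."""
    expanded = []
    for tag in tags:
        current = tag
        # Stop at the top of the hierarchy or at a term already collected,
        # which also guards against accidental cycles in the mapping.
        while current is not None and current not in expanded:
            expanded.append(current)
            current = BROADER.get(current)
    return expanded


print(expand_tags(["Food safety"]))  # ['Food safety', 'Health']
```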
Finally, the metadata output is enhanced with additional data properties. For example, companies are given a ticker, athletes are associated with their teams, and geographic locations get latitude and longitude data. You can also access the AP News Taxonomy for additional information about any given tag.
Subscribers can access the tagging service by making calls to an API. Content may be submitted as plain text or XML.
Tagging service data can be returned in Semantic-Web compatible formats, such as RDF/XML or RDF/TTL. It can also be returned in Simple XML format.
To learn more, refer to the Developer's Guide.