AP Metadata Services
Add value to your news content with AP’s industry-leading metadata: accurate, comprehensive, richly detailed data designed specifically for news publishers. AP Metadata Services is a new set of APIs that gives you direct access to the same metadata system that supports AP’s award-winning global news operation. There are two components:
- AP News Taxonomy
A comprehensive classification system, including standardized subjects, people, organizations, geographic locations and more, all designed with news content in mind. Frequent updates ensure timeliness, and a rich network of semantic relationships between concepts enables creative solutions for content linking, search and discovery.
- AP Tagging Service
Receives your news content and automatically enriches it with all the relevant metadata tags from the News Taxonomy. This smart service goes well beyond mere text extraction; it uses human-created semantic rules to understand your content and identify the most pertinent concepts and topics.
With AP Metadata Services, you can:
- Deliver targeted, relevant news products based on particular topics.
- Create engaging search and discovery experiences for your readers.
- Support contextual advertising.
- Use detailed content analytics to inform editorial coverage and planning.
AP News Taxonomy
The AP News Taxonomy is a comprehensive set of standardized vocabularies for describing English-language news content. Terms in the vocabularies cover all aspects of news: subjects, people, places, organizations, and more. When you submit content to the automated AP Tagging Service, the data that comes back is drawn from these vocabularies. Publishers may also choose to integrate the AP News Taxonomy into their own publishing systems to support manual tagging. In addition to standardized terminology and unique IDs, the taxonomy stores a variety of details about the people, places, and topics it contains. This information can power enhanced search experiences, browsing and discovery, or informational displays. For example:
- Synonyms, acronyms, and spelling variants.
- Properties of people, places and things – such as an athlete’s uniform number, the latitude and longitude of a geographic location, or the stock ticker symbol for a company.
- Relationships between concepts, such as between two people (Parent-Child), or between a person and a group or organization (Athlete-Team).
- Hierarchical structure for subjects and geographic locations, to enable both broad and narrow searches.
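To make these term details concrete, here is a minimal Python sketch of how a publisher might hold taxonomy terms locally, resolve synonyms, and walk up a hierarchy. The `Term` class, its field names, the IDs, and the sample values are all hypothetical illustrations; they are not the AP data model or real AP data.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Term:
    """Hypothetical local representation of one taxonomy term."""
    term_id: str                       # placeholder ID, not a real AP identifier
    name: str                          # standardized name
    synonyms: list[str] = field(default_factory=list)
    properties: dict[str, str] = field(default_factory=dict)
    broader: Optional[str] = None      # ID of the parent term, if any


def build_label_index(terms: list[Term]) -> dict[str, Term]:
    """Map every name and synonym (case-insensitive) to its term."""
    index: dict[str, Term] = {}
    for term in terms:
        for label in [term.name, *term.synonyms]:
            index[label.casefold()] = term
    return index


def broader_chain(term: Term, by_id: dict[str, Term]) -> list[str]:
    """Return names from the term up to the top of its hierarchy."""
    names = [term.name]
    while term.broader is not None:
        term = by_id[term.broader]
        names.append(term.name)
    return names


# Illustrative sample data only.
terms = [
    Term("geo-1", "North America"),
    Term("geo-2", "United States", synonyms=["USA", "U.S."], broader="geo-1"),
    Term("geo-3", "New York City", synonyms=["NYC"],
         properties={"latitude": "40.71", "longitude": "-74.01"},
         broader="geo-2"),
]
by_id = {t.term_id: t for t in terms}
index = build_label_index(terms)

match = index["nyc"]
print(match.name)                    # New York City
print(broader_chain(match, by_id))   # ['New York City', 'United States', 'North America']
```

A search for the narrow term can match only "New York City", while a broad search can include every term whose chain contains "United States".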
What is included in the taxonomy?
There are five main areas of coverage:
- AP Subject
A wide variety of hierarchically structured topics ranging from broad categories (Crime) to specific concepts (Illegal firearms). Also includes many named events such as Academy Awards and Tour de France. More than 4,200 terms in all.
- AP Geography
Over 2,200 geographic place names arranged hierarchically – continents, world regions, countries, territories, national capitals, major world cities, US states, Canadian provinces, and a large number of US cities and towns.
- AP Organization
Organizations and institutions from a wide variety of sectors: government organizations, non-profits, sports teams, colleges and universities, political and ideological groups, cultural institutions, and more. Over 2,400 different terms.
- AP Person
Celebrities, artists, designers, authors, business leaders, political figures, sports figures, royalty, and other newsmakers known at the global or US national level. Coverage is especially broad for US newsmakers in politics, entertainment and sports, including complete rosters for major professional sports teams, men’s NCAA Division I basketball and football players, all US officeholders at the federal and gubernatorial levels, and all candidates for those offices. More than 106,000 individuals covered.
- AP Company
Over 50,000 publicly traded companies – including all companies with primary shares trading on any of 70 major global stock exchanges, or trading as ADRs on an American exchange.
What about updates?
The AP News Taxonomy is constantly being updated to capture the latest news and the biggest newsmakers. Whether it’s this week’s IPOs or the new crop of college athletes, AP’s taxonomy developers are always working to keep the vocabularies current and relevant.
Subscribers are kept up to date in real time: as soon as a change is published, the new version becomes available in the AP News Taxonomy. Numerical versioning keeps the changes organized and in sync with the data provided by the AP Tagging Service. A detailed log of all changes is accessible through a separate API. You can keep track of all changes, or just the ones you care most about.
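As a rough illustration of tracking "just the ones you care most about", the following Python sketch filters a change log by version number and vocabulary. The `Change` record, its field names, and the sample entries are assumptions made for this example; the actual change-log format is described in the Developer's Guide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Change:
    """Hypothetical change-log entry; the field names are assumptions."""
    version: int        # taxonomy version in which the change was published
    vocabulary: str     # e.g. "AP Person"
    term_id: str        # placeholder ID
    action: str         # e.g. "added", "updated", "deprecated"


def changes_since(log, last_seen_version, vocabularies=None):
    """Return entries newer than last_seen_version, oldest first.

    If vocabularies is given, keep only entries from those vocabularies.
    """
    selected = [
        c for c in log
        if c.version > last_seen_version
        and (vocabularies is None or c.vocabulary in vocabularies)
    ]
    return sorted(selected, key=lambda c: c.version)


# Illustrative sample data only.
log = [
    Change(101, "AP Company", "co-1", "added"),
    Change(102, "AP Person", "per-7", "updated"),
    Change(103, "AP Person", "per-9", "added"),
    Change(103, "AP Subject", "sub-4", "deprecated"),
]

recent_people = changes_since(log, last_seen_version=101,
                              vocabularies={"AP Person"})
print([c.term_id for c in recent_people])   # ['per-7', 'per-9']
```

Storing the highest version already processed lets a subscriber apply only newer entries on the next check.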
How does the service work?
The taxonomy is accessed by making calls to an API (Application Programming Interface). The subscriber can request the full set of terms in a given vocabulary, a subset of terms, or information about a particular term. Calls are also available for retrieving deprecated terms, term change logs, and additional information about the structure of the taxonomy. Taxonomy data can be returned in a variety of Semantic-Web compatible formats, including RDF (XML or TTL) and NewsML-G2. It can also be returned in HTML format. A comprehensive Developer’s Guide provides all the necessary details.
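For readers who want a feel for consuming the RDF/XML output, here is a small Python sketch that reads term labels and parent links using only the standard library. It assumes the RDF uses the W3C SKOS vocabulary (`skos:Concept`, `skos:prefLabel`, `skos:altLabel`, `skos:broader`); that assumption, the `example.invalid` URIs, and the sample document are illustrative and should be checked against the Developer's Guide.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
SKOS = "http://www.w3.org/2004/02/skos/core#"

# Hand-written sample, not real service output.
SAMPLE = f"""<rdf:RDF xmlns:rdf="{RDF}" xmlns:skos="{SKOS}">
  <skos:Concept rdf:about="http://example.invalid/term/food-safety">
    <skos:prefLabel xml:lang="en">Food safety</skos:prefLabel>
    <skos:altLabel xml:lang="en">Food hygiene</skos:altLabel>
    <skos:broader rdf:resource="http://example.invalid/term/health"/>
  </skos:Concept>
</rdf:RDF>"""


def parse_concepts(rdf_xml: str) -> list[dict]:
    """Extract ID, label, synonyms, and broader links for each concept."""
    root = ET.fromstring(rdf_xml)
    concepts = []
    for node in root.iter(f"{{{SKOS}}}Concept"):
        concepts.append({
            "id": node.get(f"{{{RDF}}}about"),
            "label": node.findtext(f"{{{SKOS}}}prefLabel"),
            "synonyms": [e.text for e in node.findall(f"{{{SKOS}}}altLabel")],
            "broader": [e.get(f"{{{RDF}}}resource")
                        for e in node.findall(f"{{{SKOS}}}broader")],
        })
    return concepts


concepts = parse_concepts(SAMPLE)
print(concepts[0]["label"])     # Food safety
print(concepts[0]["broader"])   # ['http://example.invalid/term/health']
```

For production use, a dedicated RDF library is usually a better fit than raw XML parsing, since RDF/XML allows several equivalent serializations of the same data.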
AP Tagging Service
What is the tagging service?
The AP Tagging Service analyzes English-language news content and automatically returns relevant metadata, using standardized terminology from the AP News Taxonomy. The process identifies people, companies, geographic locations, organizations, and a wide array of subjects. The system recognizes and returns specific entities that it finds in the submitted content (also known as “text extraction”), but it also goes beyond that, using human-created semantic rules to identify topics that may not be explicitly mentioned in the text at all. For example, a story about a particular country music star can trigger the “Country music” subject, even if the word “music” does not appear in the story. Human-managed rules allow for more precise control over the performance of the service.
What types of metadata does the service provide?
Drawing from the AP News Taxonomy, the tagging service looks at each piece of submitted content and returns standardized names and IDs for all types of metadata: subjects, geographic locations, organizations, people, and companies. The user can specify which types should be returned by the service.
After all the matching metadata values have been identified, the service checks for additional standardized names and IDs based on relationships stored in the AP News Taxonomy. For instance, the subject hierarchy will ensure that any item tagged with “Food safety” will also be tagged with “Health”, and any content that picks up a sports league will be tagged with the relevant sport subject.
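The relationship step described above can be pictured as a closure over "implies" links. The Python sketch below is one possible way to model it, not the service's actual implementation; the `IMPLIED` map, the function name, and the league and sport names are hypothetical, with only the “Food safety” to “Health” pair taken from the example in the text.

```python
def expand_tags(tags, implied):
    """Return tags plus everything reachable through `implied` links.

    Original tags come first; added tags follow in discovery order.
    """
    expanded = list(dict.fromkeys(tags))   # de-duplicate, keep order
    seen = set(expanded)
    queue = list(expanded)
    while queue:
        current = queue.pop(0)
        for extra in implied.get(current, []):
            if extra not in seen:
                seen.add(extra)
                expanded.append(extra)
                queue.append(extra)
    return expanded


# Illustrative relationships only.
IMPLIED = {
    "Food safety": ["Health"],
    "Example Hockey League": ["Hockey"],
    "Hockey": ["Sports"],
}

print(expand_tags(["Food safety", "Example Hockey League"], IMPLIED))
# ['Food safety', 'Example Hockey League', 'Health', 'Hockey', 'Sports']
```

Tracking already-seen tags keeps the output free of duplicates and guards against loops if two terms ever point at each other.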
Finally, the metadata output will be enhanced with additional data properties. Companies will be given a ticker; athletes will be associated with their teams; geographic locations will get latitude and longitude data; and so forth. Users can also access the AP News Taxonomy for additional information about any given tag.
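A simple way to picture this enrichment step is a lookup that attaches stored properties to each tag. The Python sketch below uses made-up names, property keys, and values throughout; it only illustrates the idea and does not reflect the service's real output structure.

```python
def enrich(tags, properties):
    """Attach stored properties to each tag name.

    Tags without stored properties get an empty dict.
    """
    return [
        {"name": tag, "properties": dict(properties.get(tag, {}))}
        for tag in tags
    ]


# Illustrative property store only.
PROPERTIES = {
    "Example Corp": {"ticker": "EXMP"},
    "Jane Doe": {"team": "Example City Hawks"},
    "Exampleville": {"latitude": "10.0", "longitude": "20.0"},
}

for item in enrich(["Example Corp", "Jane Doe", "Unknown topic"], PROPERTIES):
    print(item)
# {'name': 'Example Corp', 'properties': {'ticker': 'EXMP'}}
# {'name': 'Jane Doe', 'properties': {'team': 'Example City Hawks'}}
# {'name': 'Unknown topic', 'properties': {}}
```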
How does the service work?
The tagging service is accessed by making calls to an API (Application Programming Interface). Subscribers may submit content in plain text or XML, and can specify which types of data should be returned.
Tagging service data can be returned in a variety of Semantic-Web compatible formats, including RDF (XML or TTL) and NewsML-G2. It can also be returned in Simple XML format. A comprehensive Developer’s Guide provides all the necessary details.
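To show the general shape of such a call, the Python sketch below builds (but does not send) an HTTP request that submits plain-text content and asks for particular tag types and an output format. The endpoint URL, the `types` and `format` parameter names, and their values are placeholders invented for this example; the real method names and parameters are documented in the Developer's Guide.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Placeholder endpoint: an assumption for illustration, not the real API URL.
BASE_URL = "https://example.invalid/tagging"


def build_tagging_request(content, content_type="text/plain",
                          tag_types=(), output_format="rdfxml"):
    """Build a POST request carrying the content to be tagged.

    `types` and `format` are hypothetical query parameter names.
    """
    params = {"format": output_format}
    if tag_types:
        params["types"] = ",".join(tag_types)
    return Request(
        url=f"{BASE_URL}?{urlencode(params)}",
        data=content.encode("utf-8"),
        headers={"Content-Type": f"{content_type}; charset=utf-8"},
        method="POST",
    )


req = build_tagging_request(
    "Regulators announced a new food safety inspection program.",
    tag_types=("subject", "organization"),
)
print(req.get_method())   # POST
print(req.full_url)
# https://example.invalid/tagging?format=rdfxml&types=subject%2Corganization
```

Building the request separately from sending it makes it easy to inspect the URL, headers, and body in tests before any live call is made, for example with `urllib.request.urlopen(req)` once the real endpoint and credentials are in place.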
AP Metadata Services Documentation
- AP Metadata Services Overview
Get a quick introduction to the APMS (PDF).
- Developer's Guide
Review the complete documentation of the API methods (PDF).
- API Console
Interact with the APMS API. Discover, test and debug live calls.
- Code Examples (PDF)
Get a head start with code samples in C# and Java. The samples are currently available in the Developer's Guide on page 24.