Add value to your news content with AP Taxonomy and Tagging

AP's industry-leading metadata is accurate, comprehensive and richly detailed, and it is designed specifically for use by news publishers. AP Metadata Services APIs give you direct access to the same metadata system that supports AP's award-winning global news operation.

What are the Metadata Services components?

  • News Taxonomy, a comprehensive set of standardized AP vocabularies built for news, as well as mappings to other taxonomies, such as the IAB Tech Lab Content Taxonomy and any custom taxonomy terms added by the user.
  • Tagging Service, an auto-classification system that enriches your content with relevant metadata tags from the AP News Taxonomy as well as from others, such as the IAB Tech Lab Content Taxonomy or AP Core tags - a set of basic categories based on the AP Subject taxonomy.

What are the benefits?

Good metadata offers a variety of benefits, opening up new possibilities for connecting with readers and managing content:

  • Deliver targeted, relevant news products based on particular topics.
  • Create engaging search and discovery experiences for your readers.
  • Support contextual advertising.
  • Use detailed content analytics to inform editorial coverage and planning.

Quick Links

Overview
Get a quick introduction to the AP Metadata Services (PDF).

Solutions and Use Cases
Find out how you can take advantage of AP metadata to enrich your news content (PDF).

Developer's Guide
Review the complete documentation of API methods (PDF).

API Explorer (Swagger)
Interact with the API. Discover, test and debug live calls.

GitHub
View code samples and ontology files.

RDF/XML, RDF/TTL, JSON-LD, HTML
View taxonomy data samples.

RDF/XML, RDF/TTL, n-triples, JSON-LD, Simple XML
View tagging service data samples.


Talk to us

Contact us at apmetadata@ap.org

News Taxonomy

A comprehensive classification system for English-language news content

  • AP News Taxonomy includes standardized subjects, planned and breaking news events, people, organizations, geographic locations and more, all designed with news content in mind. Frequent updates ensure timeliness, and a rich network of semantic relationships between concepts enables creative solutions for content linking, search and discovery.
  • Mappings to the IAB Tech Lab Content Taxonomy can be added to your service to augment your contextual advertising.
  • You can also add any other custom taxonomy terms or vocabularies you may need via an API.

How can I use it?

Integrate the News Taxonomy into your publishing systems to support manual tagging, or use it together with the AP Tagging Service to apply the taxonomy to your content automatically.

In addition to standardized terminology and unique IDs, the taxonomy stores a variety of details about the people, places, and topics it contains.

This information can power enhanced search experiences, browsing and discovery, or informational displays; for example:

  • Synonyms, acronyms, and spelling variants.
  • Properties of people, places and things – such as an athlete's uniform number, the latitude and longitude of a geographic location, or the stock ticker symbol for a company.
  • Relationships between concepts, such as between two people (Parent-Child), or between a person and a group or organization (Athlete-Team).
  • Hierarchical structure for subjects and geographic locations, to enable both broad and narrow searches.
  • Mappings to IAB Tech Lab Content Taxonomy for contextual advertising support.
  • Optionally, relationships between AP News Taxonomy terms and your own custom vocabularies that are crucial to your workflow.

What is included in the News Taxonomy?

AP News Taxonomy

The AP News Taxonomy has five main areas of coverage:

  • AP Subject
    A wide variety of hierarchically structured topics ranging from broad categories (Crime) to specific concepts (Illegal firearms). Also includes many named events such as Academy Awards and Tour de France. More than 4,200 terms in all.

  • AP Geography
    Over 2,500 geographic place names arranged hierarchically – continents, world regions, countries, territories, national capitals, major world cities, US states, Canadian provinces, and a large number of US cities and towns.

  • AP Organization
    Organizations and institutions from a wide variety of sectors: government organizations, non-profits, sports teams, colleges and universities, political and ideological groups, cultural institutions, and more. Over 2,500 different terms.

  • AP Person
    Celebrities, artists, designers, authors, business leaders, political figures, sports figures, royalty, and other newsmakers known at the global or US national level. Coverage is especially broad for US newsmakers in politics, entertainment and sports, including complete rosters for major professional sports teams, men’s NCAA Division I basketball and football players, all US officeholders at the federal and gubernatorial levels, and all candidates for those offices. More than 160,000 individuals covered.

  • AP Company
    Over 70,000 publicly traded companies – including all companies with primary shares trading on any of 70 major global stock exchanges, or trading as ADRs on an American exchange.

IAB Tech Lab Content Taxonomy Mappings

The IAB Tech Lab Content Taxonomy is the Interactive Advertising Bureau's standard collection of content classification categories. The taxonomy service gives you access to mappings between the AP News Taxonomy's Subject terms and the IAB's classification categories.

Custom Taxonomy Integration

We also offer an API for you to add your own custom taxonomy to this collection. You can link your taxonomy terms to AP's, for example adding local towns or neighborhoods to AP's existing Geography terms, or you can maintain your own separate vocabularies.

Additional taxonomy mappings and features will be added on an ongoing basis - please check back here for updates.

What about taxonomy updates?

The AP News Taxonomy is constantly being updated to capture the latest news and the biggest newsmakers. Whether it’s this week’s IPOs or the new crop of college athletes, AP’s taxonomy developers are always working to keep the vocabularies current and relevant.

Subscribers are kept up-to-date in real time – as soon as a change is published, the new version becomes available in the AP News Taxonomy. Numerical versioning keeps the changes organized and in sync with the data provided by the AP Tagging Service. A detailed log of all changes is accessible through a separate API. You can keep track of all changes, or just the ones you care most about.
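
As a rough illustration of tracking "just the ones you care most about", the sketch below filters a change log by version number and vocabulary. The record fields (version, vocabulary, term, action) and the sample entries are assumptions made for this example; the actual change log format is described in the Developer's Guide.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class Change:
    """One change log entry. Field names are hypothetical, not AP's schema."""
    version: int      # numerical taxonomy version in which the change appeared
    vocabulary: str   # e.g. "Subject", "Person", "Company"
    term: str         # standardized term name
    action: str       # e.g. "added", "updated", "deprecated"


def changes_since(log: Iterable[Change],
                  last_seen_version: int,
                  vocabularies: Optional[Iterable[str]] = None) -> List[Change]:
    """Return entries newer than last_seen_version, optionally limited
    to the vocabularies the caller cares about."""
    wanted = None if vocabularies is None else set(vocabularies)
    return [
        change for change in log
        if change.version > last_seen_version
        and (wanted is None or change.vocabulary in wanted)
    ]


# Illustrative data only.
SAMPLE_LOG = [
    Change(101, "Company", "Example Corp", "added"),
    Change(102, "Person", "Jane Example", "added"),
    Change(103, "Company", "Old Example Inc", "deprecated"),
]

company_updates = changes_since(SAMPLE_LOG, last_seen_version=101,
                                vocabularies=["Company"])
```

Storing the last version number you processed lets each run pick up only what has changed since then.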

Taxonomy mappings are updated as external taxonomy updates become available and as new mappings are completed.

How does it work?

Subscribers access the taxonomy by calling an API to request the full set of terms in a given vocabulary, a subset of terms, or information about a particular term. Calls are also available for retrieving deprecated terms, term change logs, and additional information about the structure of the taxonomy. Taxonomy data can be returned in Semantic Web-compatible formats, such as RDF/XML, RDF/TTL or JSON-LD. Information about a particular term can also be returned as HTML.
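
A minimal sketch of how a client might prepare such a call is shown below. The base URL, path layout, `apikey` parameter and the use of an Accept header to choose the format are all assumptions made for illustration; only the media types themselves are standard. The real endpoints and parameters are documented in the Developer's Guide.

```python
from typing import Optional
from urllib.parse import quote, urlencode
from urllib.request import Request

# Hypothetical base URL, used only for this example.
BASE_URL = "https://api.example.com/taxonomy"

# Standard media types for the formats named above.
MEDIA_TYPES = {
    "rdfxml": "application/rdf+xml",
    "ttl": "text/turtle",
    "jsonld": "application/ld+json",
    "html": "text/html",
}


def build_taxonomy_request(vocabulary: str,
                           term_id: Optional[str] = None,
                           fmt: str = "jsonld",
                           api_key: str = "YOUR_KEY") -> Request:
    """Prepare (but do not send) a request for a whole vocabulary,
    or for a single term when term_id is given."""
    if fmt not in MEDIA_TYPES:
        raise ValueError(f"unsupported format: {fmt}")
    if fmt == "html" and term_id is None:
        # HTML output is described only for single-term lookups.
        raise ValueError("HTML is available for a particular term only")
    path = f"{BASE_URL}/{quote(vocabulary, safe='')}"
    if term_id is not None:
        path += f"/{quote(term_id, safe='')}"
    url = f"{path}?{urlencode({'apikey': api_key})}"
    return Request(url, headers={"Accept": MEDIA_TYPES[fmt]})


request = build_taxonomy_request("subject", fmt="ttl")
```

Passing the prepared request to `urllib.request.urlopen` would send it; that step is left out here because the URL is a placeholder.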

To learn more, refer to the Developer's Guide.

Tagging Service

Enrich your content with tags from AP News Taxonomy or other standard taxonomies

The AP Tagging Service receives your English-language news content and automatically returns relevant metadata, using standardized terminology from the AP News Taxonomy, AP Core categories or the IAB Tech Lab Content Taxonomy.

This smart service goes well beyond mere text extraction; it uses human-created semantic rules to understand your content and identify the most pertinent concepts and topics.

Why human-created rules?

Human-created rules allow for more precise control over the service performance.

The system recognizes and returns specific entities that it finds in the submitted content (also known as "text extraction"), and it also uses human-created semantic rules to identify topics that may not be explicitly mentioned in the text at all.

For example, a story about a particular country music star can trigger the "Country music" subject, even if the word "music" does not appear in the story.
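
The idea can be illustrated with a toy rule table. This is a deliberately simplified sketch, not the AP rule engine: the rule format, the trigger phrases and the performer name "Jane Example" are all invented for the example.

```python
import re
from typing import Dict, List

# Hypothetical rules: each subject lists phrases that imply it.
RULES: Dict[str, List[str]] = {
    "Country music": ["Jane Example", "pedal steel guitar"],
}


def tag_subjects(text: str, rules: Dict[str, List[str]] = RULES) -> List[str]:
    """Return every subject for which at least one trigger phrase
    appears in the text as a whole phrase, ignoring case."""
    subjects = []
    for subject, triggers in rules.items():
        for trigger in triggers:
            pattern = r"\b" + re.escape(trigger) + r"\b"
            if re.search(pattern, text, flags=re.IGNORECASE):
                subjects.append(subject)
                break  # one matching trigger is enough for this subject
    return subjects


story = "Jane Example announced a new tour with stops in Nashville."
subjects = tag_subjects(story)
```

Here the story is assigned the "Country music" subject through the performer's name alone, even though the word "music" never appears in it.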

What types of metadata are returned?

The tagging service looks at each piece of submitted content and returns standardized names, IDs and other properties for all the relevant taxonomy terms that triggered a semantic rule. You can specify which taxonomy you want applied - AP's, IAB's or Core - and which of the five areas of AP News Taxonomy's coverage the service should return.

After all the matching metadata values have been identified, the service checks for additional standardized names and IDs based on relationships stored in the AP News Taxonomy. For example, the subject hierarchy ensures that any item tagged with "Food safety" is also tagged with "Health", and any content that picks up a sports league is tagged with the relevant sport subject.
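
This relationship-based step can be sketched as a walk over "implies" links. The mapping below is a small illustrative stand-in for the relationships stored in the taxonomy: the "Food safety" to "Health" link comes from the example above, while the league and sport entries are hypothetical.

```python
from collections import deque
from typing import Dict, Iterable, List

# Illustrative stand-in for taxonomy relationships (child -> implied terms).
IMPLIED: Dict[str, List[str]] = {
    "Food safety": ["Health"],
    "Example Basketball League": ["Basketball"],  # hypothetical league
    "Basketball": ["Sports"],
}


def expand_tags(tags: Iterable[str],
                implied: Dict[str, List[str]] = IMPLIED) -> List[str]:
    """Return the original tags followed by every term they imply,
    directly or through a chain of relationships, without duplicates."""
    result = list(dict.fromkeys(tags))  # de-duplicate, keep input order
    seen = set(result)
    queue = deque(result)
    while queue:
        current = queue.popleft()
        for broader in implied.get(current, []):
            if broader not in seen:
                seen.add(broader)
                result.append(broader)
                queue.append(broader)
    return result


expanded = expand_tags(["Food safety"])
```

Because the walk is breadth-first and tracks visited terms, a term reachable by several paths is added only once.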

Finally, the metadata output is enhanced with additional data properties. For example, companies are given a ticker, athletes are associated with their teams, and geographic locations get latitude and longitude data. You can also access the AP News Taxonomy or the IAB Tech Lab Content Taxonomy mapping for additional information about any given tag.

Additional tagging features will be added on an ongoing basis - please check back here for updates.

How does it work?

Subscribers can access the tagging service through an API in two ways: by making synchronous calls, or by submitting multiple documents simultaneously and retrieving the tags at a later time. Content may be submitted as plain text or XML.

Tagging service data can be returned in Semantic Web-compatible formats, such as RDF/XML, RDF/TTL, n-triples or JSON-LD. It can also be returned in Simple XML format. In addition, you can integrate with the Tagging API through Zapier.

To learn more, refer to the Developer's Guide.