Extract XMP Metadata


Extract XMP metadata from PDF documents using the pdf-xmp endpoint.


The pdf-xmp endpoint extracts XMP metadata from PDF documents. This tutorial shows how to extract XMP metadata from a PDF document through the pdf-xmp endpoint. We first call the endpoint directly using REST. We then use the DynamicPDF client libraries to call it from C#, Java, Node.js, and PHP.

Required Resources

To complete this tutorial, you must add the Get XMP Metadata sample to your samples folder in your cloud storage space using the Resource Manager. After adding the sample resources, you should see a samples/get-xmp-metadata-pdf-endpoint folder containing the resources for this tutorial.

| Sample | Sample Folder | Resources |
|---|---|---|
| Get XMP Metadata | samples/get-xmp-metadata-pdf-endpoint | fw4.pdf |

  • From the Resource Manager, download fw4.pdf to your local system; here we assume /temp/dynamicpdf-api-samples/get-xmp-metadata.
  • After downloading, delete fw4.pdf from your cloud storage space using the Resource Manager.

| Resource | Cloud/Local |
|---|---|
| fw4.pdf | local |

tip

See Sample Resources for instructions on adding sample resources.

Obtaining API Key

This tutorial assumes you have a valid API key obtained from the DynamicPDF Cloud API's Environment Manager. Refer to the following for instructions on getting an API key.

tip

If you are not familiar with the Resource Manager or Apps and API Keys, refer to the following tutorial and relevant Users Guide pages.

Calling API Directly Using POST

The pdf-xmp endpoint accepts a POST request. When using cURL, you specify the endpoint, the HTTP method, the API key, and the required local resources. The following steps illustrate.

  • Create a cURL POST request, where you pass the API key as a header and the PDF as binary data.
curl -X POST "https://api.dynamicpdf.com/v1.0/pdf-xmp" -H "Content-Type: application/pdf" -H "Authorization: Bearer xxxxxxxx" --data-binary "@c:/temp/dynamicpdf-api-samples/get-xmp-metadata/fw4.pdf"
  • Execute the cURL command and the XML metadata is written to the command line.
<?xpacket begin="" id="W5M0MpCehiHzreSzNTczkc9d"?>
<x:xmpmeta xmlns:x="adobe:ns:meta/" x:xmptk="Adobe XMP Core 4.2.1-c043 52.398682, 2009/08/10-13:00:47        ">
    <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
        <rdf:Description rdf:about=""
            xmlns:dc="http://purl.org/dc/elements/1.1/">
            <dc:format>application/pdf</dc:format>
            <dc:subject>
                <rdf:Bag>
                    <rdf:li>Fillable</rdf:li>
                </rdf:Bag>
            </dc:subject>
            <dc:description>
                <rdf:Alt>
                    <rdf:li xml:lang="x-default">Employee's Withholding Certificate</rdf:li>
                </rdf:Alt>
            </dc:description>
            <dc:creator>
                <rdf:Seq>
                    <rdf:li>SE:W:CAR:MP</rdf:li>
                </rdf:Seq>
            </dc:creator>
            <dc:title>
                <rdf:Alt>
                    <rdf:li xml:lang="x-default">2021 Form W-4</rdf:li>
                </rdf:Alt>
            </dc:title>
        </rdf:Description>
        <rdf:Description rdf:about=""
            xmlns:xmp="http://ns.adobe.com/xap/1.0/">
            <xmp:CreatorTool>Adobe LiveCycle Designer ES 9.0</xmp:CreatorTool>
            <xmp:MetadataDate>2020-12-31T09:12:43-05:00</xmp:MetadataDate>
            <xmp:ModifyDate>2020-12-31T09:12:43-05:00</xmp:ModifyDate>
            <xmp:CreateDate>2020-12-31T09:12:43-05:00</xmp:CreateDate>
        </rdf:Description>
        <rdf:Description rdf:about=""
            xmlns:pdf="http://ns.adobe.com/pdf/1.3/">
            <pdf:Producer>Adobe LiveCycle Designer ES 9.0</pdf:Producer>
            <pdf:Keywords>Fillable</pdf:Keywords>
        </rdf:Description>
        <rdf:Description rdf:about=""
            xmlns:xmpMM="http://ns.adobe.com/xap/1.0/mm/">
            <xmpMM:DocumentID>uuid:01d97a6e-5605-44ae-8015-54a82bc56c5c</xmpMM:DocumentID>
            <xmpMM:InstanceID>uuid:9d6007b3-eacb-4f13-8d6b-da9d46b7dfb3</xmpMM:InstanceID>
        </rdf:Description>
        <rdf:Description rdf:about=""
            xmlns:desc="http://ns.adobe.com/xfa/promoted-desc/">
            <desc:embeddedHref rdf:parseType="Resource">
                <rdf:value>..\..\..\..\..\..\..\TFACS\Misc\logo\pencil.bmp</rdf:value>
                <desc:ref>/template/subform[1]/subform[3]/draw[2]</desc:ref>
            </desc:embeddedHref>
        </rdf:Description>
    </rdf:RDF>
</x:xmpmeta>
<?xpacket end="w"?>
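
The same POST request can also be issued without cURL. The following is a minimal sketch using the JDK's built-in java.net.http client (Java 11 or later), not the DynamicPDF client library. The endpoint URL and the two headers are taken from the cURL command above; the class name PdfXmpRestSketch and the helper buildRequest are hypothetical names chosen for illustration, and the API key and file path are supplied as command-line arguments.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.file.Files;
import java.nio.file.Paths;

class PdfXmpRestSketch {

    // Endpoint URL copied from the cURL command in this tutorial.
    static final String ENDPOINT = "https://api.dynamicpdf.com/v1.0/pdf-xmp";

    // Hypothetical helper: builds the POST request with the PDF bytes as the
    // body and the API key sent as a Bearer token, mirroring the cURL flags.
    static HttpRequest buildRequest(String apiKey, byte[] pdfBytes) {
        return HttpRequest.newBuilder(URI.create(ENDPOINT))
                .header("Content-Type", "application/pdf")
                .header("Authorization", "Bearer " + apiKey)
                .POST(HttpRequest.BodyPublishers.ofByteArray(pdfBytes))
                .build();
    }

    public static void main(String[] args) throws IOException, InterruptedException {
        if (args.length < 2) {
            System.out.println("usage: java PdfXmpRestSketch.java <apiKey> <pathToPdf>");
            return;
        }
        // Read the local PDF and send it to the endpoint.
        byte[] pdfBytes = Files.readAllBytes(Paths.get(args[1]));
        HttpRequest request = buildRequest(args[0], pdfBytes);
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());

        // Print the HTTP status followed by the response body (the XMP XML on success).
        System.out.println(response.statusCode());
        System.out.println(response.body());
    }
}
```

Keeping the request construction in its own method makes it possible to inspect the method, URL, and headers without making a network call.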

Calling Endpoint Using Client Library

To simplify development, you can also use one of the DynamicPDF Cloud API client libraries. Use the client library of your choice to complete this tutorial section.

Complete Source

You can access the complete source for this project at one of the following GitHub projects.

| Language | File Name | Location (package/namespace/etc.) | GitHub Project |
|---|---|---|---|
| Java | GetXmpMetaData.java | com.dynamicpdf.api.examples | https://github.com/dynamicpdf-api/java-client-examples |
| C# | Program.cs | GetXmpMetaData | https://github.com/dynamicpdf-api/dotnet-client-examples |
| Node.js | GetXmpMetaData.js | nodejs-client-examples | https://github.com/dynamicpdf-api/nodejs-client-examples |
| PHP | GetXmpMetaData.php | php-client-examples | https://github.com/dynamicpdf-api/php-client-examples |

In all four languages, the steps were similar. First, we created a new PdfResource instance by passing the path to the PDF to its constructor. Next, we created a new instance of the PdfXmp class, which abstracts the pdf-xmp endpoint. Finally, we called the Process method and printed the resulting XML to the console.
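
The XML printed in the final step is ordinary RDF/XML, so individual fields can be read from it with a standard parser. The following is a minimal sketch using only the JDK's DOM parser, independent of the DynamicPDF client library. The class name XmpTitleSketch and the helper extractTitle are hypothetical, and the sample string is an abbreviated copy of the fw4.pdf output shown earlier that keeps only the dc:title element.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class XmpTitleSketch {

    // Namespace URIs as they appear in the XMP output of this tutorial.
    static final String DC_NS = "http://purl.org/dc/elements/1.1/";
    static final String RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#";

    // Hypothetical helper: returns the text of the first rdf:li under dc:title,
    // or null when the packet has no title.
    static String extractTitle(String xmpXml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // required for the namespace lookups below
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xmpXml)));

            NodeList titles = doc.getElementsByTagNameNS(DC_NS, "title");
            if (titles.getLength() == 0) {
                return null;
            }
            NodeList items = ((Element) titles.item(0)).getElementsByTagNameNS(RDF_NS, "li");
            return items.getLength() == 0 ? null : items.item(0).getTextContent().trim();
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XMP packet", e);
        }
    }

    public static void main(String[] args) {
        // Abbreviated sample: only the dc:title portion of the fw4.pdf metadata.
        String sample =
                "<?xpacket begin=\"\" id=\"W5M0MpCehiHzreSzNTczkc9d\"?>"
              + "<x:xmpmeta xmlns:x=\"adobe:ns:meta/\">"
              + "<rdf:RDF xmlns:rdf=\"http://www.w3.org/1999/02/22-rdf-syntax-ns#\">"
              + "<rdf:Description rdf:about=\"\" xmlns:dc=\"http://purl.org/dc/elements/1.1/\">"
              + "<dc:title><rdf:Alt>"
              + "<rdf:li xml:lang=\"x-default\">2021 Form W-4</rdf:li>"
              + "</rdf:Alt></dc:title>"
              + "</rdf:Description></rdf:RDF></x:xmpmeta>"
              + "<?xpacket end=\"w\"?>";

        System.out.println(extractTitle(sample)); // prints "2021 Form W-4"
    }
}
```

The lookup is by namespace URI rather than by prefix because the prefixes (dc, rdf) are only conventions and a different producer could choose other ones.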