Drupal: Working with Data Imports Using Feeds

Jason Narog, Drupal Developer

Importing content into Drupal has gone through a few iterations in the last few years. In Drupal 6, FeedAPI, Node Import, and Feeds (the successor to FeedAPI) were all useful modules, and which one you reached for depended on the type of content you were trying to import. Node Import, for example, was more successful with images and alt tags.

But in Drupal 7 there is no Node Import module. There is only Feeds, plus a variety of helper modules that cover a good number of import tasks. The most useful of these is the Feeds Tamper module. Want to import into a multi-value (or unlimited-value) field using Feeds? Grab Feeds Tamper and click the Tamper tab on your field mapping page. Find the field you want to import into, add an Explode method, and enter the character (or characters) you would like to use as a separator. Each separated value is then imported as its own item in the multi-value field. I’m partial to the double pipe ( || ) separator.
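To make the Explode step concrete, here is a minimal sketch of the idea in Python. It is not Feeds Tamper’s own PHP code, and the function name and sample values are made up for illustration. It simply splits one imported cell on the separator so that each piece can become its own value in the field.

```python
def explode(raw_value, separator="||"):
    """Split one imported cell into a list of values.

    Hypothetical helper for illustration only; it mirrors the idea behind
    the Explode method, not Feeds Tamper's actual implementation.
    """
    return raw_value.split(separator)


# One spreadsheet cell holding three values for a multi-value field.
cell = "red||green||blue"
print(explode(cell))
```

Note that the split is literal: any spaces you type around the separator in your source file stay attached to the values, so keep the cell tidy or trim the values afterwards.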

Feeds is useful for taking large amounts of content and importing it into your defined content types from RSS, CSV (for example, a spreadsheet exported from Excel), or XML. You group the content together, set up your Feeds importer according to your specs, choose your field mappings, and then either upload a document or import from a URL. Brand new to Feeds? No worries, let’s walk through the available settings one by one.

Navigate to Structure and click on Feeds Importers. Now click Add Importer. Give it a meaningful name and a description, then click Create.

Click the Settings link under Basic settings. Here you can either attach your importer to a specific content type or use the standalone form available at yourwebsite.com/import. You can also set how often imports happen (which depends on how often your cron runs), whether a feed should be imported immediately, and whether your site should batch process the import (good for very large imports) or just run it until it’s done. You’ll typically attach the importer to a content type if you intend to import content from another website, and you’ll use the standalone form if you are uploading a one-time document.

Next up are the fetcher options. Clicking Change gives you the choice of uploading a document (File upload) or importing a file from the web (HTTP Fetcher). Your choice determines which settings appear below it. File upload lets you define which file extensions are allowed to be imported. HTTP Fetcher lets you specify settings for your external import, as well as what it should do if it can’t find the specified feed.

Parser options allow you to select what type of file you’ll be working with. The Common syndication parser is primarily for importing existing RSS feed content into a website, as you would for a content aggregator site. The CSV parser allows you to import comma-delimited, tab-delimited, and a variety of other delimited files created in programs such as Microsoft Excel. I personally have never worked with the OPML parser or the XML sitemaps parser, but both handle particular types of XML documents. There are no configuration options available for any of the three XML-based parsers. The CSV parser allows you to define a default delimiter and choose whether or not your document has headers. Both settings can be overridden when importing.
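If the delimiter and header settings feel abstract, this short Python sketch shows what they control. It is only an illustration using Python’s standard csv module, not the Feeds CSV parser itself, and the function name, the sample data, and the choice to fall back to column positions when there is no header row are my own assumptions.

```python
import csv
import io


def parse_csv(text, delimiter=",", has_header=True):
    """Turn delimited text into a list of {column: value} rows.

    Illustrative helper only; the names and behavior are assumptions,
    not part of the Feeds module.
    """
    rows = list(csv.reader(io.StringIO(text), delimiter=delimiter))
    if not rows:
        return []
    if has_header:
        # The first row names the columns; the remaining rows are data.
        header, data = rows[0], rows[1:]
    else:
        # No header row: refer to columns by position ("0", "1", ...).
        header = [str(i) for i in range(len(rows[0]))]
        data = rows
    return [dict(zip(header, row)) for row in data]


sample = "title;body\nFirst post;Hello\nSecond post;World\n"
for item in parse_csv(sample, delimiter=";"):
    print(item["title"], "->", item["body"])
```

The same file parsed with the wrong delimiter would come back as a single column per row, which is usually the first thing to check when an import produces empty fields.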

The processor setting lets you change what type of entity you are importing. By default your options are node, taxonomy term, and user. For the node processor you can select the node type you want to import into, what to do if it encounters nodes that already exist, the text format, the author, and whether or not the node expires. The taxonomy term processor works in a similar manner, with the node type selection replaced by a vocabulary selection. The user processor lets you set default roles.
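The “what to do if it encounters nodes that already exist” choice is easier to reason about with a small example. The Python sketch below is my own illustration, not Feeds code: the function name, the guid key used to recognize an existing item, and the update_existing flag are all hypothetical stand-ins for the idea of skipping versus updating.

```python
def process(items, existing, key="guid", update_existing=False):
    """Import items into `existing`, a dict keyed by a unique value.

    Hypothetical sketch: returns (created, updated, skipped) counts.
    """
    created = updated = skipped = 0
    for item in items:
        unique = item[key]
        if unique not in existing:
            existing[unique] = dict(item)   # brand-new content
            created += 1
        elif update_existing:
            existing[unique].update(item)   # overwrite stored values
            updated += 1
        else:
            skipped += 1                    # leave the stored copy alone
    return created, updated, skipped


site = {"a": {"guid": "a", "title": "Old title"}}
incoming = [{"guid": "a", "title": "New title"}, {"guid": "b", "title": "B"}]
print(process(incoming, site))
print(process(incoming, site, update_existing=True))
```

Whichever behavior you choose, the import needs some value that identifies each item reliably; without one, a re-import has no way to tell new content from content it has already created.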

The meat and potatoes of Feeds is found in mapping. Here you define which columns (or XML elements) in your source map to which fields on the entity your processor creates. You can import images, text fields, emails, phone numbers, addresses, etc.
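Conceptually, a mapping is just a lookup table from source column to target field. Here is a minimal Python sketch of that idea; the column names, the field names such as field_tags, and the map_item helper are all made up for illustration and are not taken from Feeds.

```python
# Source column name -> target field name (all hypothetical).
MAPPING = {
    "Title": "title",
    "Body": "body",
    "Tags": "field_tags",
}


def map_item(item, mapping):
    """Copy each mapped source column of one parsed row to its target field."""
    return {
        target: item[source]
        for source, target in mapping.items()
        if source in item
    }


row = {"Title": "Hello", "Body": "First post", "Internal notes": "skip me"}
print(map_item(row, MAPPING))
```

Columns with no mapping, like the notes column here, are simply ignored, and mapped columns that are missing from a row are left unset.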

As of this writing, I’ve yet to find a way to import alt and title text for images. But you can import field collections if you have the Field Collection Feeds module. This requires building a separate field collection importer that has to be run after your initial import, but it does let you update your field collection bundles. Instead of selecting node, taxonomy term, or user, you have a new processor option, ‘field collection processor.’ Setting it up is somewhat tricky and is covered in Drupal Field Collection Feeds Imports.

But the best part of Feeds is that you can bundle your importers into a feature using the Features module. This gives you reusable importers for content types that you use on a variety of websites, without having to set them up again each time. It also moves the importer configuration out of the database and into a module, which gives you something to fall back on in case you accidentally adjust a field you didn’t mean to.

Should you use Feeds on every project? No. It is probably overkill for a smaller site that only has a few pages of content and won’t be updated very often. But it’s great when you know you’ll be working with a lot of data or want to set up an aggregator site that fetches content from a variety of sources.
