Migration from Drupal to WordPress

I’ve been working on migrating a fairly large site from Drupal to WordPress (WP is actually just a stopover; the site is moving on from there to another platform) and noticed a lack of decent information about Drupal-to-WP migration.  Since I had to put some time into cleaning up MySQL scripts and working through the move, I thought it’d be a good idea to document the process for anyone who needs to make the same migration in the future.

First off, the original Drupal site is at version 6.22 and I’m migrating into WP 3.4.2.

Initial steps:

  1. Get MySQL access to the Drupal site, either from the command line via SSH or through phpMyAdmin.
  2. Grab a full dump of the DB (yes, you may not need the whole thing, but this way you know you’ve got all the info you might need).  In my case, the Drupal db backup module was incapable of backing up this db.
  3. Create a new database on the system you will use for migration and call it “drupal” – in my case I am working locally on my XAMPP MySQL db.
  4. Import the db dump into your new database from the command line.  Don’t waste your time trying to use phpMyAdmin if you’ve got a decent-sized backup (mine was over 3 GB).  The syntax is: mysql -u username -p drupal < drupal.sql (replace username with your MySQL username and be prepared to be asked for the db password).  The import may take a few minutes for a large database.  (*Note: if you are having trouble getting the import to work, you likely have MySQL buffer sizes set too low.  I found this to be the case with the XAMPP install of MySQL, but I easily fixed it by editing my.ini via the XAMPP control panel.  I think the innodb_buffer_pool_size setting was the one that did it, but I upped most of the buffers substantially.  On a machine used only for the migration, you can generally raise innodb_buffer_pool_size to about 50% of your system memory.)
  5. So now you’ve got your own import of the Drupal database.  Go ahead and create your new WP database if you haven’t already and do your WP install.

Now we’re ready to start migrating data.  First off, I import categories, but I need to warn you: WP requires unique category names, and Drupal apparently does not.  You will most likely need to make sure that you don’t have Drupal term_data entries whose names differ only slightly, e.g., Immigration and immigration.  Change one of the names to something unique, such as changing immigration to immigration issues, and the import will work.
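Before running the import, it helps to list the term names that would collide. What follows is a minimal sketch of that check, not part of the original script. It uses Python’s built-in sqlite3 module with a throwaway in-memory table standing in for drupal.term_data; the sample rows are made up, and find_colliding_names and build_sample_db are names chosen purely for illustration. The query uses only LOWER, GROUP BY and HAVING, so it should also run against the real MySQL table with FROM drupal.term_data, but treat that as an assumption to verify.

```python
import sqlite3

# Groups term names case-insensitively and keeps only the groups that
# contain more than one entry. Assumption: the same query works on MySQL
# when pointed at drupal.term_data.
COLLISION_SQL = """
    SELECT LOWER(name) AS lowered, COUNT(*) AS copies
    FROM term_data
    GROUP BY LOWER(name)
    HAVING COUNT(*) > 1
    ORDER BY lowered
"""


def find_colliding_names(conn):
    """Return (lowercased name, count) pairs for names that collide."""
    return conn.execute(COLLISION_SQL).fetchall()


def build_sample_db():
    """Create an in-memory stand-in for drupal.term_data with made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE term_data (tid INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany(
        "INSERT INTO term_data (tid, name) VALUES (?, ?)",
        [(1, "Immigration"), (2, "immigration"), (3, "Taxes")],
    )
    return conn


if __name__ == "__main__":
    for lowered, copies in find_colliding_names(build_sample_db()):
        print(f"{lowered!r} appears {copies} times; rename all but one")
```

Any name the check reports needs to be renamed in Drupal (or in your local copy of the database) before the category import.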

This script, which I have slightly modified from the original by Lincoln Hawks at SocialCmsBuzz.com, did the trick.  I have also added a user import, which wasn’t in the original.  My WP database is named wptest and the Drupal db is named “drupal”, as noted before.

TRUNCATE TABLE wptest.wp_comments;
TRUNCATE TABLE wptest.wp_links;
TRUNCATE TABLE wptest.wp_postmeta;
TRUNCATE TABLE wptest.wp_posts;
TRUNCATE TABLE wptest.wp_term_relationships;
TRUNCATE TABLE wptest.wp_term_taxonomy;
TRUNCATE TABLE wptest.wp_terms;

INSERT INTO wptest.wp_terms (term_id, `name`, slug, term_group)
SELECT
d.tid, d.name, REPLACE(LOWER(d.name), ' ', '-'), 0
FROM drupal.term_data d
INNER JOIN drupal.term_hierarchy h
USING(tid)
;

INSERT INTO wptest.wp_term_taxonomy (term_id, taxonomy, description, parent)
SELECT
d.tid `term_id`,
'category' `taxonomy`,
d.description `description`,
h.parent `parent`
FROM drupal.term_data d
INNER JOIN drupal.term_hierarchy h
USING(tid)
;

INSERT INTO
wptest.wp_posts (id, post_date, post_content, post_title,
post_excerpt, post_name, post_modified)
SELECT DISTINCT
n.nid, FROM_UNIXTIME(created), body, n.title,
teaser,
REPLACE(REPLACE(REPLACE(REPLACE(LOWER(n.title),' ', '-'),'.', '-'),',', '-'),'+', '-'),
FROM_UNIXTIME(changed)
FROM drupal.node n, drupal.node_revisions r
WHERE n.vid = r.vid;

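-- Look up term_taxonomy_id by term_id rather than assuming it equals the Drupal tid.
INSERT INTO wptest.wp_term_relationships (object_id, term_taxonomy_id)
SELECT DISTINCT tn.nid, tt.term_taxonomy_id
FROM drupal.term_node tn
INNER JOIN wptest.wp_term_taxonomy tt ON tt.term_id = tn.tid;

UPDATE wptest.wp_term_taxonomy tt
SET `count` = (
SELECT COUNT(tr.object_id)
FROM wptest.wp_term_relationships tr
WHERE tr.term_taxonomy_id = tt.term_taxonomy_id
);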

INSERT INTO wptest.wp_comments (comment_post_ID, comment_date, comment_content, comment_parent, comment_author, comment_author_email, comment_author_url, comment_approved)
SELECT nid, FROM_UNIXTIME(timestamp), comment, thread, name, mail, homepage, status FROM drupal.comments;

UPDATE wptest.wp_posts SET `comment_count` = (SELECT COUNT(`comment_post_ID`) FROM wptest.wp_comments WHERE wptest.wp_posts.`ID` = wptest.wp_comments.`comment_post_ID`);

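-- The two strings were stripped when this post was published; this is most likely
-- the step that turns Drupal's teaser break marker into the WordPress "more" tag.
UPDATE wptest.wp_posts SET post_content = REPLACE(post_content, '<!--break-->', '<!--more-->');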

UPDATE wptest.wp_posts SET post_content = REPLACE(post_content, '"/sites/default/files/', '"/wp-content/uploads/');

INSERT IGNORE INTO wptest.wp_users SELECT NULL AS ID, NAME AS user_login, SUBSTRING(MD5(RAND()) FROM 1 FOR 30) AS user_pass, NAME AS user_nicename, mail AS user_email, '' AS user_url, FROM_UNIXTIME(created) AS user_registered, '' AS user_activation_key, 0 AS user_status, NAME AS display_name FROM drupal.users;

-- Posts were inserted with ID = nid, so join on that rather than on the title.
UPDATE wptest.wp_posts JOIN drupal.node ON drupal.node.nid = wptest.wp_posts.ID JOIN drupal.users ON drupal.users.uid = drupal.node.uid JOIN wptest.wp_users ON drupal.users.name = wptest.wp_users.user_nicename SET wptest.wp_posts.post_author = wptest.wp_users.ID;

 

One note for you: I found that the line that updates the content image links didn’t fully take care of what I needed.  My Drupal db was full of absolute links for image sources, so I ended up having to change each one with something like: UPDATE wptest.wp_posts SET post_content = REPLACE(post_content, 'http://my.drupal.site/sites/default/files/', '/wp-content/uploads/oldimages/'); for each of the image URLs.  I then mass-imported the images via FTP into my wp-content/uploads/oldimages/ directory.  This meant they weren’t included in the WP media library, as I would have preferred, but at least the images were there.
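If you would rather fix the absolute links in one pass instead of running one UPDATE per URL, the rewrite can be sketched in Python. This is only an illustration under some assumptions: rewrite_file_urls is a hypothetical helper, the host names are placeholders, and you would still need to read each post_content value from wptest.wp_posts and write it back yourself.

```python
import re


def rewrite_file_urls(html, drupal_hosts,
                      new_prefix="/wp-content/uploads/oldimages/"):
    """Point absolute Drupal file URLs in html at the new uploads folder.

    drupal_hosts lists the host names the old site was reachable under
    (placeholders in the example below). Only http:// and https:// URLs
    under /sites/default/files/ on those hosts are rewritten.
    """
    hosts = "|".join(re.escape(host) for host in drupal_hosts)
    pattern = re.compile(
        r"https?://(?:" + hosts + r")/sites/default/files/",
        re.IGNORECASE,
    )
    # A function is used as the replacement so new_prefix is inserted literally.
    return pattern.sub(lambda _match: new_prefix, html)


if __name__ == "__main__":
    sample = '<img src="http://my.drupal.site/sites/default/files/photo.jpg">'
    print(rewrite_file_urls(sample, ["my.drupal.site", "www.my.drupal.site"]))
```

Links to other hosts are left untouched, so it is worth searching the content afterwards for any remaining /sites/default/files/ references.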

Let me know if you’ve got improvements, especially if you write a script to import the images into the media library!

 

Again with the “WordPress Isn’t Secure” Meme

As I was going about my morning reading, I came across an article with this dire headline: “The Perils of Using WordPress as a Hotel Website Content Management System.”  Of course, being a professional who spends much of his time working on WordPress, my interest was piqued.  From the article:

WordPress technology is ill-fitted to power hotel websites’ content management systems and is only adequate as a blogging technology.

Hmmm…that’s a pretty serious allegation.  So I read on.  The crux of his argument was that a WordPress system can be hacked using the technique described in the post contained in this article. So where is the fatal flaw?  Apparently if a user creates an insecure password, the system can be breached by blunt force.

Blunt force.  Right.  So if you were to ignore WordPress’ own warnings that your password was not strong, you might be hackable. This is not a system problem, it is a user problem.  It is in fact a problem inherent in computing in general and in any system which uses passwords.

Read on further and you’ll find the co-writers of this article have designed their own hotel content management system.  I’m going to guess that they don’t use passwords though, since those would be insecure.  But I will venture a guess without looking at their system:  it is neither open source,  nor are there millions of users who are trained and ready to work in the system.

I could continue ripping their post apart, but on further reading, it is an obvious attempt to get some Google juice for their site. They are simply not worth it.  If you have a hotel and want a simple, easy to use, and effective hotel website, drop me a line and I’ll get you set up for a fraction of what they’d charge you.  In the long run, you’ll be better off.

 

Death of Journalism – In the end, it was us all along

In the end, it wasn’t the Internet, or television, or even bad management that killed journalism; we did it to ourselves.

When I was in college, we held up the Bob Woodwards of the world as our mentors.  The story mattered above all else.  No matter who it was, it was the journalist’s job to expose corruption.  We were to be that shining light in dark places.

This election cycle has shown us the lie in that.  We’ve got supposed journalists falling all over each other to produce dubious “fact checks” based on a pyramid of half truths and deceptions carefully spoon fed by the campaigns.  The keyword now in politics is “control the narrative” and that means finding ways to get their version of the story out.

To be clear, as a journalist, you cannot spin a story.  Propagandists spin stories, PR flacks spin stories, journalists report.  We are meant to be that sharp probe stuck into sensitive areas, not some dull mouthpiece regurgitating the party line.

It’s not that one side or the other is the problem; it is that it has become utterly apparent to everyone that the media has a side.  On the right, Fox, and in opposition, MSNBC.  Too many issues have been left alone.  Far too many.

For example, how is it that we have an ambassador killed in Libya, yet none find fault with our government?  To be clear, their “it was the video” mantra was an obvious fabrication. The truth, that it was a coordinated attack by terrorists, and even worse, that we had repeatedly denied requests for more security, and in fact, removed 34 security personnel over the previous 6 months, was concealed and only now comes to light.

Are there still bright points?  To be certain.  One has only to look at the work done by Univision on the Fast and Furious scandal, another story most media shied away from for over a year. (I link to ABC News – their English language partner)

To be fair, I could just as easily link to instances of press ignoring CIA lies about Iraq having weapons of mass destruction etc.

The death of journalism comes when journalists look the other way.  When journalists give up.  When they become propagandists.

We cannot allow journalism to die.

Blue Ice – the new Techno-thriller by Mark N. Cahill

26 years in the making…I finally got the proof copy of my novel, Blue Ice.

It’s true – I started writing it, and in fact, most of the first chapter remains from 1986, when I began writing it in Manomet.  From the jacket copy:

I started writing Blue Ice in 1986, on a cold windy night in the house I lived in on the beach in Manomet, MA, using an IBM PCjr whose only storage was a 5 1/4″ floppy drive. The first few pages were written that night, and remain pretty much as I wrote them then. I pecked at the book on and off for several years, until I finally got the resolve in 1997 and sat down to finish the job.

I spent some time during 1997 sending out copies to publishers, and twice came very close, making two separate publication lists for 1998, but in each case I was bumped for an established writer rather late in the game. Prior to starting work on my next book, I vowed that I would have to publish the first one. So here it is.

There’s still a bit of a way to go.  I expect that October 15 as a release date is wishful thinking, especially since I haven’t begun to do the final proofread. The good news for you is you now know what you’ll be giving everyone on your Christmas/Hanukkah/Kwanzaa/Festivus list.

You can find out more about the book at BlueIceBook.com (that site in and of itself is a story, but I will tell that story in a later post).

Oracle Moves Imperil Web Development with MySQL

Often the biggest threats come not from without, but from within.

There’s a big issue looming for web and mobile developers, and if it happens, it will affect virtually everyone who uses the web.  The problem is that a very large proportion of database-driven websites and applications (think of the stuff on your phone) is built using a MySQL database.  We developers used that particular database, in general, because it was free, and most websites can’t afford the license cost of Microsoft SQL Server (think $5k a server or more) or, even worse, Oracle (think $10k a processor per server).

Everything was fine until one day, Sun bought MySQL. That worried us, but then our fears compounded when about a year later, Oracle bought Sun.  The web’s most important open source database was in the hands of the company that sells its huge enterprise big brother: the guys who make big bucks selling databases.

We shuddered. Obviously the move was worthy of Standard Oil in its heyday, one summed up in a single word: monopoly.  Meanwhile the Oracle PR team went into overdrive telling us it simply wasn’t true.

So we held our breath and waited.

On Saturday, TechCrunch posted:

Oracle is holding back test cases in the latest release of MySQL. It’s a move that has all the markings of the company’s continued efforts to further close up the open source software and alienate the MySQL developer community.

The issue stems back to a recent discovery that the latest MySQL release has bug fixes but without a single one having any test cases associated with it.  That creates all sorts of problems for developers who have no assurance that the problem is actually fixed.

Open source software relies on transparency.  As consumers (meaning site owners, business owners, etc.) we need to know what is going on with the code.  What changed, what is going to change, etc.

There is simply too much invested in MySQL by too many of us.  A large portion of the online economy is built on MySQL: sites like Wikipedia, American Airlines, Ticketmaster, Zappos, etc.  MySQL is the database of the Internet.

Deep down, even with $$$$ invested in MySQL, we’ve got to worry that Oracle has a strong vested interest in seeing MySQL go away. After all, when you own the most expensive of enterprise databases, your view of “open and free” is going to be dark and black.

Is this the end?  No.  However, this is a situation that could potentially affect all of us, and it bears close watching.

Web Design and Engineering Posts at CahillDigital.com

 

I’ve started blogging on my new company site – http://cahilldigital.com so you will definitely want to add that to your reading list.  My last post was about Responsive Design, Retina Displays and their importance in new site development.  The title is “Welcome to Retina-stan.”

Meanwhile in the present day, designers have to deal with a web that requires a “responsive” design: one that, by using media queries, will present the appropriate CSS3 styles to a browser, be it on a phone, tablet, laptop or whatever. Essentially, we’ve created a variable-driven CSS, one of the things we realized virtually from the outset was missing from CSS.  Say goodbye to those funky old conditional comments that allowed us to work around that failing.

Now we have Retina displays, which, unlike the phones and tablets that require different widths and sizes, actually change the pixel density of the images and fonts that are presented.  It’s all pretty cool, but honestly, it’s also more work if you want to do it right.

As usual, a post packed full of useful development goodness, but also important for small business owners. Check it out at CahillDigital.com

Red Sox Pull Off the Impossible

I’m going to state this clearly right at the outset: the Red Sox lost me last September with their heartless, uninspired play.  This season I’ve watched more Orioles baseball than Red Sox, and the little Red Sox I did watch sickened me.

Since last year all we heard was that it would be impossible for the team to unload Beckett, Crawford, etc. due to their tainted image (in Crawford’s case, total lack of performance as well) and bloated salaries.  Not without the Sox essentially paying for them to play somewhere else.

So like so many other baseball truths, we find that the real case is you can’t unload them until you actually try to unload them. Apparently we hit that point with the recent clubhouse turmoil, and the utterly disgusting on field performance.  It was obvious, this was a group of millionaires whose only commonality was they had paychecks written by the same management group. This was certainly not a team.

Shipping out Beckett, Crawford and Gonzalez was a good start.  Dispatching their payroll, in almost its entirety, to Magic Johnson and the Dodgers is a giant step.  Notably, after the deal, Alfredo Aceves was suspended for 3 games for complaining in the clubhouse.  Perhaps a new day is dawning.

I’m not ready to jump back on the Red Sox bandwagon.  They’ve got a lot of things to do before that happens.  But the move is one in the right direction and I’m guardedly optimistic.  If it were given to me as an option, I’d rather see a raft of Pawtucket prospects playing their hearts out than a return to the soulless superstar zombies we’ve seen over the past two years.

So where to from here?

  • It is painfully obvious that a clubhouse leader is needed.  
  • It is similarly obvious that anyone added to the team better have the work ethic.
  • Clubhouse cancer must be and will be exorcised, no matter the amount of pain.
  • Bobby V., we hardly knew ye…yes, I think he’s going to have to go. As will virtually every other coach.  We need to build from scratch.
  • Some degree of management restructuring is called for.  John Henry and Larry Lucchino must bear a good part of the blame.

In some ways, it is fittingly ironic that even in death Johnny Pesky has been a force for good with the club.  I truly believe that the players’ shameful turnout on Monday for his funeral was the absolute last straw for management, and who could blame them?

I expect we’ll see big moves over the remainder of the season, perhaps a good chance for us to see what we’ve got in the minors, and over the winter, I think the fabric of the team will change substantially.

As for me, I’m intrigued, but they definitely will need to show me more.

 

*Update:
I may be wrong about the reasons – in this post WickedClevah posits that the deal had more to do with an impending 2013 50% luxury tax on the overly large Red Sox payroll.

 

If John Henry did not like the previous luxury tax system – and he did not, to the tune of $500,000 – it seemed safe to assume that the new CBA with its more onerous luxury tax provisions would have a substantial impact on the Red Sox payroll and operational structure moving forward. And while the head of the players union downplayed the notion that the new CBA would constrain Red Sox (or Yankee) payrolls as recently as March of this year, the Marco Scutaro trade two months before was proof enough that the times were changing. When the Red Sox trade their starting shortstop for a long relief candidate simply because the trading partner will pay a one year $6M commitment, it’s difficult to argue that it’s business as usual. As Keith Law said at the time, “You don’t dump a 3 win player making $6MM for no return.”

 

So I guess there are a lot of reasons to go ahead.  It might also be that while the Pesky Funeral embarrassment makes a good justification, the true goal is, as Mikey Corleone might tell us, “just business”.

An End to the Tyranny of the Commute

My years at Namemedia ended last month and I’m no longer forced to commute to Waltham from Central MA.  While I miss everyone I worked with at Namemedia, I certainly do not miss the commute.

Our workdays are long enough.  When you start adding an additional 1.5 hours each way, minimum, to your commute, it gets downright awful.  Then, add on top the fact that at least once a week you can expect a 2 hour or longer commute, usually in the morning with a tie up at the Rt. 128 Tolls.  In the end it leaves little time for anything else in your life.

So you compensate by trying to go into the office a little earlier to miss some of the traffic.  Or you stay later to miss the worst of the rush.  The next thing you know you’ve committed 14 hours of your day to the job. You simply aren’t left with much.

So you get home, and everyone wonders why you’re zombie-like; you just want to sit and relax for a little while, then off to bed so you can get up early and start it all over again.

In my many years of commuting on the Mass Pike, I learned a lot.  Here are a few tips:

  • Sneaky Alternate Routes Rule – No, I won’t publish mine here, but there are other ways.  When in doubt, avail yourself of the other options.
  • Aggressive Drivers Suck – newsflash: driving 90 weaving in and out when everyone else is doing 70 doesn’t get you there any faster.  I generally pass clowns like you at the tolls.  And no, that’s not a “you’re number one” signal everyone is giving you.
  • Easy Pass Costs You 30 Minutes Every Morning – On the morning commute, at Rt. 128 Tolls, the Easy Pass lanes back up for at least 1/4 mile.  However, if you pay cash, you can generally pass ALL that traffic and drive right up to the booth on the far left.
  • Boston Radio is Dead – we knew it was done when BCN shed its mortal coil.  There is no radio in Boston save talk radio and NPR.  I generally opt for podcasts through my smartphone.  Wherefore art thou, Dwayne Glasscock!
  • Cars Are Tools – if you’re going to commute over 100 miles a day, your car is a tool and needs to be treated as such.  You need to get the maximum longevity out of it, and you need to be a slave to routine maintenance. Buy cars for durability, buy used, and make sure the one you pick is comfortable, because you’re going to spend a good chunk of your life in there.
  • Auto Costs Add Up – if you commute 100 miles a day, 5 days a week that’s 500 miles – call it two tanks of gas a week for me, minimum.  At $3.80 a gallon, that’s around $100 a week. Add on tolls, maintenance, and the fact that you are now the grim reaper of motor vehicles, and you’ve got a solid 12-15k a year in costs.  Conservatively…

Thank God the Tyranny is over!