Monday, 14 July 2008

Snippet: Copy MySQL Databases Over SSH

I needed to copy a database and the idea of backing it up just to re-import1 seemed like double the work. Here’s a snippet to pipe a mysqldump into a remote database. Keep an eye on the user names and passwords – you’ll need 3 sets; one for the database your copying, one to get into your remote server, and one for the remote, target database.

mysqldump -v -uUSER -pPASSWORD --opt --compress DATABASE_NAME | ssh REMOTE_SERVER_USER@REMOTE_HOSTNAME mysql -uREMOTE-MYSQL-USER -pREMOTE_MYSQL_PASSWORD REMOTE_DATABASE_NAME

1. Backing up is a good thing. Why aren’t you doing it? Here’s a script for that.
mysqldump -h HOSTNAME DATABASE_NAME | gzip -9 > BACKUP_DIR/DATABASE_NAME.sql.gz

Sunday, 24 February 2008

Optimization Tips: Ruby on Rails and MySQL

Stop.

I’ve uncovered these tips after (at least) the 3rd refactoring effort of some fairly simple, straight-forward Rails code. Rails is great for getting ideas prototyped super fast. These tips will slow down development and make apps less portable. Continue reading only if you’re running a live app in production and not happy with how resource-intensive it is.

My approach to this round of optimization was watch the Load calculations in the development log and optimize transactions with a Load greater than 0.0009.

  • First, a rule of thumb: Development boxes are faster than production boxes. If it’s acceptably fast locally, then it’ll probably be a turtle in production.
  • Use Model.find_by_sql or Model.count_by_sql whenever possible.
    Slow: Person.find_by_name('JoeyJoeJoe')
    Fast: Person.find_by_sql("SELECT person.* WHERE person.name = 'JoeyJoeJoe'")
  • Don’t put keys, IDs, or other numbers within quotes in your Finds
    Slow: WHERE id = "1234"
    Fast: WHERE id = 1234
  • Only request the specific database column/model attribute you want
    Slow: Person.find_by_name('JoeyJoeJoe').height
    Fast: Person.find_by_sql("SELECT person.height WHERE person.name = 'JoeyJoeJoe'"). Put indexes on all these columns/attributes.
  • Use connection.insert, connection.update, connection.delete for database transactions performed on an array of models or transactions that don’t need the overhead of a model.
  • Slow: ...WHERE table_1.id = table_2.table1_id...
    Fast: ...table_1 JOIN table_2 ON table1.id = table_2.table1_id....
  • Many tiny database transactions are faster than 1 big one
    Slow: stuff = Stuff.find_by_sql("SELECT everything.* FROM everything JOIN (box_1, box_2) ON (everything.id = box_1.everything_id box_2.box1_id = box_1.id")
    Fast: boxes = Boxes.find_by_sql("SELECT box_1.everything_id FROM box_1 JOIN box_2 ON box_2.box1_id = box_1.id")
    followed by
    boxes.each do |box_1|
    stuff = Stuff.find_by_sql("SELECT everything.* FROM everything WHERE id = #{box_1.everything_id}
    end
  • If you’re running the InnoDB datastore (vs MyISAM) try cranking up your innodb_buffer_pool_size. I say start by doubling it. Seriously.
  • Again, if you’re running InnoDB and doing a anything more involved than the simplest ORDER BY, try cranking up your sort_buffer_size and read_rnd_buffer_size to something in the double-digit M range.
  • Also, if you’re comparing datastore engines, MyISAM has full text search, InnoDB doesn’t. So you’ll need a clever work around. There are a number of them.