
MongoDB or CouchDB or something else?

I just had a chat with a business colleague about "use MongoDB or CouchDB".
From my point of view, it's clear. MongoDB does not look as open source as it should be (correct me if I am wrong), CouchDB is open source, so CouchDB wins.
While searching the net, I found three pages I would like to share.

High-Performance Websites - PHPUG and WebPerformanceUG

Arne Blankerts, Stefan Priebsch - High-Performance Websites

This talk contains information about architecture.

Classical Architecture

We have done this for quite a while.

Browser->WebServer->PHP->Database

This worked well in the past. The general solution was to add more hardware, and that leads to more errors.

  • monolithic architecture
  • normalized data (in database)
  • pull principle (the page is built per request)
  • full page cache is bad
  • Edge Side Includes (ESI, for example with Varnish) are also not the best solution

We are fast when:

  • no real workload per request
  • everything from memory
  • denormalized data
  • snippets

How Often Does Data Change (From Rare To Often)?

Based on the example of a shop system.

  • catalog
  • product information
  • price
  • availability

Who Changes Data?

  • editor
  • product manager
  • receipt of goods / outgoing goods

New Architecture (CQRS-Architecture)

  • images and static content are delivered by a static content web server
  • only the dynamic part is done by PHP
  • snippets are delivered, not full pages

PHP->Key-Value-Store->Backend-Process->Database
or
PHP->Web-Service->Database

The benefit of a key-value store (for example Redis) is that the database itself can die, and the only consequence is that the entries become stale.
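
To make the first variant a bit more concrete, here is a minimal sketch of my own, not code from the talk. A plain dict stands in for the key-value store (Redis or similar), another dict stands in for the database, and the names `backend_process` and `handle_request` as well as the sample product are made up for illustration.

```python
import json

# Assumption: a plain dict stands in for a key-value store such as Redis.
kv_store = {}

# Assumption: a dict stands in for the relational database (made-up sample data).
database = {"p1": {"name": "Red shoe", "price": "49.90"}}


def backend_process():
    """Write side: copy denormalized product data from the database into the store."""
    for key, row in database.items():
        kv_store["product:" + key] = json.dumps(row)


def handle_request(product_key):
    """Read side: the request handler only ever talks to the key-value store."""
    raw = kv_store.get("product:" + product_key)
    return json.loads(raw) if raw is not None else None


backend_process()
database.clear()  # simulate the database going down
print(handle_request("p1"))  # still answered from the store, possibly outdated
```

The request path never touches the database, so a database outage only means that the backend process cannot refresh the entries.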

But What About Filter Navigation?

Use a search engine that simply returns a collection of product keys. Use these product keys to fetch the product data from the key-value store.
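
A rough sketch of that two-step lookup, under my own assumptions: `search_index` is a hand-written dict that plays the role of the search engine, `kv_store` is a dict instead of a real key-value store, and all keys and products are invented.

```python
# Assumption: a dict stands in for the key-value store (made-up sample data).
kv_store = {
    "product:1": {"name": "Red shoe", "price": 50},
    "product:2": {"name": "Blue shoe", "price": 60},
    "product:3": {"name": "Red hat", "price": 20},
}

# Assumption: a hand-written mapping plays the role of the search engine.
search_index = {
    "color:red": ["product:1", "product:3"],
    "color:blue": ["product:2"],
}


def search(filter_term):
    """Step 1: the search engine returns product keys only, no product data."""
    return search_index.get(filter_term, [])


def fetch_products(keys):
    """Step 2: resolve the keys against the key-value store."""
    return [kv_store[key] for key in keys if key in kv_store]


names = [product["name"] for product in fetch_products(search("color:red"))]
print(names)
```

The search engine stays small because it only knows keys, and the product data lives in exactly one place.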

But What About Personalization?

  • use snippets with variables and default values
  • fetch this snippet from the key-value store
  • thanks to the "time to live" (TTL) feature provided by many key-value stores, you can easily define "special offers" per day and so on

Nice meetup, incredible how many people are attending already.

FrOSCon - Beyond LAMP

First talk finished :-). No proofreading done so far. Next talk will start in a few seconds.

It's All About Scale

  • size
  • features
  • non-functional requirements
  • language agnostic: the language you are using doesn't matter

Background Tasks

The user can't wait and the server can't handle it

  • sending mail
  • calling 3rd party APIs / services
  • converting media (image, video, audio)
  • updating caches

Direct Approach

Don't do that since you are losing control

Slightly Less Bad Idea

  • write jobs into an SQL database
  • have some workers that poll the database for jobs

Don't use it

Queues To The Rescue

  • write jobs into message queue
  • have some workers for the queue
  • many services available (ZeroMQ, RabbitMQ, Redis Pub/Sub, Amazon SQS ...)

Good idea, but a lot of work and hard to get right
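
The basic pattern, reduced to something that runs on one machine, could look like the sketch below. It is my own illustration and uses Python's in-process `queue.Queue` and a thread instead of one of the message queue services listed above; `send_mail` is a hypothetical task that only returns a string.

```python
import queue
import threading

jobs = queue.Queue()   # in-process stand-in for a real message queue
results = []


def send_mail(recipient):
    """Hypothetical slow task; here it only reports what it would do."""
    return "mail sent to " + recipient


def worker():
    """Take jobs off the queue one by one until the shutdown sentinel arrives."""
    while True:
        job = jobs.get()
        if job is None:  # sentinel: stop the worker
            jobs.task_done()
            break
        func, args = job
        results.append(func(*args))
        jobs.task_done()


thread = threading.Thread(target=worker, daemon=True)
thread.start()

# The request handler only enqueues the job and returns immediately.
jobs.put((send_mail, ("alice@example.com",)))
jobs.put((send_mail, ("bob@example.com",)))

jobs.put(None)  # ask the worker to shut down
jobs.join()     # wait until every queued item has been processed
thread.join()
print(results)
```

A real setup also has to deal with retries, failed jobs and workers on other machines, which is the part that is hard to get right.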

Beanstalkd

  • create a job
  • process a job

RQ

  • simple job queue in Python
  • backed by Redis
  • enqueue a function call
  • run a worker that stores the result in the database

Celery

  • Python
  • multiple brokers
  • primarily Python, with clients for Ruby and PHP
  • highly available (HA)
  • fast
  • flexible
  • monitoring
  • workflows
  • time and rate limits
  • scheduling
  • autoscaling

Search

LIKE Query

  • no linguistic support
  • no ranking
  • not indexed (MySQL and PostgreSQL do offer full-text indexes, though)
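
A quick way to see the first two points, using an in-memory SQLite database. The table and the product names are made up by me, and SQLite is only a convenient stand-in for whatever SQL database you use.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (name TEXT)")
conn.executemany(
    "INSERT INTO products (name) VALUES (?)",
    [("Red shoe",), ("Shoes for running",), ("Shoelace",)],
)


def like_search(term):
    """Plain substring match: no stemming, no relevance score."""
    rows = conn.execute(
        "SELECT name FROM products WHERE name LIKE ? ORDER BY rowid",
        ("%" + term + "%",),
    )
    return [row[0] for row in rows]


# The plural form misses "Red shoe", because LIKE knows nothing about word forms.
print(like_search("shoes"))  # ['Shoes for running']

# The singular form matches everything, including "Shoelace", in insertion order.
print(like_search("shoe"))   # ['Red shoe', 'Shoes for running', 'Shoelace']
```

A search for "shoes" should find the red shoe, and "Shoelace" should not rank as high as an actual shoe. A substring match cannot express either.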

Full-Text Search Engines - Solr Vs. Elasticsearch

  • based on Lucene
  • full linguistic support
  • faceted search
  • result highlighting
  • HTTP API
  • similar but different
  • easy to set up
  • clients are available, for PHP for example Solarium, a Symfony component or Elastica
  • integrating search into a framework (mapping objects to documents and back) and handling updates and deletes takes some work

NoSQL

  • key-value stores
  • column based
  • document stores
  • graph

Errors

  • don't let it happen
  • send an email (overloads mailboxes)
  • log errors
    • Symfony (PHP) provides a component that buffers log messages and only writes all of them if an error occurs (so info-level logging only shows up when something goes wrong)
    • properly configure logging
    • use event aggregator
      • sentry
      • errbit

Sentry (event aggregator)

  • started as OSS
  • now available as SaaS
  • UDP
  • written in Django
  • PHP client available

Errbit (event aggregator)

  • still OSS

Deployment

  • FTP
  • git pull / app server via SSH
  • but you want an automated deployment like fab deploy
  • automated deployment with Webistrano (just a single button)

Wrap Up

  • use background tasks
  • use full-text search
  • try to use NoSQL where needed
  • do error logging
  • think about deployment