Wednesday, December 26, 2007

Back to the Blog ... back to the bug (report)

Today I decided to put up a post about a common requirement that leads to an interesting SQL query (see an example here) using Firebird and FlameRobin; a few seconds later I was writing bug reports instead :-( See:

[ 1858394 ] Error selecting from mon$statements

and

[ 1858397 ] Error running SELECT query. (Not a Flamerobin bug)

Hope these will be fixed soon.

EDIT: the developers responded instantly to my bug reports: the first one is already fixed in SVN and will be in the next release, and the second is not a bug.

Well, actually bug 1858397 looks like a Firebird bug, unrelated to FlameRobin or IBPP :-( Time to close that bug report and repost it in the right place, which is:

[#CORE-1668] Error running SELECT query.

EDIT: looks like this is a duplicate of bug 94, a long-standing one that won't be fixed anytime soon :-(

After the ordeal I ended up with

[ 1858592 ] Problem with LIST() function.
The LIST() result is returned as a BLOB, so it appears in the grid as a marker and you can't read its content.
Even in this case the developers quickly clarified the situation:

There are plans to make textual blobs behave just like regular varchar
columns - if not sooner it will be available in version 0.9.0.

Milan.


BTW this query doesn't work:

SELECT
  e.emp_no, e.FIRST_NAME, e.LAST_NAME,
  CASE
    WHEN ep.emp_no || ep.PROJ_ID IS NULL THEN ' didn''t take part in '
    ELSE ' took part in '
  END,
  p.proj_name
FROM
  employee e,
  project p
  LEFT OUTER JOIN EMPLOYEE_PROJECT ep
    ON e.EMP_NO = ep.EMP_NO AND p.proj_id = ep.PROJ_ID

(If you change the above to a cross join it works fine).
While this does:

SELECT
  a.emp_no, a.FIRST_NAME, a.LAST_NAME,
  CASE
    WHEN ep.emp_no || ep.PROJ_ID IS NULL THEN ' didn''t take part in '
    ELSE ' took part in '
  END,
  a.proj_name
FROM
  (SELECT * FROM employee e, project p) a
  LEFT OUTER JOIN EMPLOYEE_PROJECT ep
    ON a.EMP_NO = ep.EMP_NO AND a.proj_id = ep.PROJ_ID;
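For readers without a Firebird instance at hand, the shape of the working derived-table query can be sketched against SQLite's in-memory engine (the tiny tables and rows below are made up, standing in for EMPLOYEE.FDB):

```python
import sqlite3

# The working query shape, run against SQLite in place of Firebird;
# the sample tables and rows are invented stand-ins for EMPLOYEE.FDB.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE employee (emp_no INTEGER, first_name TEXT, last_name TEXT);
CREATE TABLE project (proj_id TEXT, proj_name TEXT);
CREATE TABLE employee_project (emp_no INTEGER, proj_id TEXT);
INSERT INTO employee VALUES (1, 'Ann', 'Lee'), (2, 'Bob', 'Ray');
INSERT INTO project VALUES ('P1', 'Alpha');
INSERT INTO employee_project VALUES (1, 'P1');
""")

rows = con.execute("""
    SELECT a.emp_no,
           CASE WHEN ep.emp_no || ep.proj_id IS NULL
                THEN ' didn''t take part in '
                ELSE ' took part in ' END,
           a.proj_name
    FROM (SELECT * FROM employee e, project p) a
    LEFT OUTER JOIN employee_project ep
      ON a.emp_no = ep.emp_no AND a.proj_id = ep.proj_id
    ORDER BY a.emp_no
""").fetchall()

for row in rows:
    print(row)
```

Wrapping the comma join in a derived table makes `a` a single table expression, so the LEFT JOIN's ON clause can see all of its columns.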

Saturday, August 04, 2007

Lufthansa ships with ...

For those spending their miles at the Lufthansa WorldShop for the first time: they ship with UPS ... I wasn't able to find this out on their site, and the tracking code remained rather obscure to me until I asked for clarification.
Their answer was quick and accurate, but is it really so difficult to put a link to the UPS site?

Thursday, June 28, 2007

Gnuwin32, what about the docs????

Today I had to surrender and use grep on a pile of rubbish, er ..., on some Cognos files, those pesky .icr files. So I went to the GnuWin32 site, downloaded the grep installer for Windows and ran it, just like any good (Windows) boy would do ... and the first surprise came: the gnuwin32/bin directory is not added to the PATH. Anyway, I added it by hand and went on, hoping that a simple grep --help would give me all the info. Here it is:
D:\FOLDER>grep --help
Usage: grep [OPTION]... PATTERN [FILE] ...
Search for PATTERN in each FILE or standard input.
Example: grep -i 'hello world' menu.h main.c

So I'd say single quotes are good, but ... this is what actually worked ... I'm just pissed off by this

D:\FOLDER>grep -H "Table :" *.icr > tables_in_catalog.txt

NO single quotes, but DOUBLE quotes! (cmd.exe doesn't treat single quotes as quoting characters, so the single-quoted example from the help text only works in Unix-style shells.)

Wednesday, June 20, 2007

Opensource database comparison

"Which is the best opensource database?", "How does MySQL/PostgreSQL/Firebird ... (you name it) stand against Firebird/PostgreSQL/MySQL ... (you name it again)?" I'm sure we've all heard these questions many times, but now I have the definitive answer, from Jim Starkey himself. You can read the full post here, but in short it is:

"I think the short answer is that nobody with the experience to do this
is interested in doing it. I know that this isn't a particularly useful
answer, but I'm afraid it is the truth."

GREAT!!!

Sunday, May 27, 2007

Automating maintenance tasks on MySQL

I'm trying to automate some trivial maintenance tasks on my own MySQL server, while also trying to minimize the effort, so ... here is the recipe:

Take an excellent generalized stored procedure like the one by Konstantin Osipov; see "Dynamic SQL in Stored Procedures" on MySQL Forge (example 4).
Tune it a bit so that it only takes non-system tables into account (I'm trying to turn it into something similar to Microsoft's sp_MSforeachtable); here is the code:

DELIMITER $$

DROP PROCEDURE IF EXISTS `test`.`sp_4_each_table` $$
CREATE DEFINER=`root`@`localhost` PROCEDURE `sp_4_each_table`(db_name VARCHAR(64), template VARCHAR(21845))
BEGIN
  DECLARE done INT DEFAULT 0;
  DECLARE tname VARCHAR(64);
  DECLARE c CURSOR FOR
    SELECT table_name FROM information_schema.TABLES
    WHERE table_schema = db_name AND table_type = 'BASE TABLE';
  DECLARE CONTINUE HANDLER FOR SQLSTATE '02000' SET done = 1;

  OPEN c;
  REPEAT
    FETCH c INTO tname;
    IF NOT done THEN  -- skip the stale row left over when FETCH hits the end
      SET @stmt_text = REPLACE(template, " ?", CONCAT(" ", tname));
      BEGIN
        DECLARE CONTINUE HANDLER FOR SQLEXCEPTION
          SELECT CONCAT("'", @stmt_text, "' command failed") AS "error";
        PREPARE stmt FROM @stmt_text;
        EXECUTE stmt;
        DEALLOCATE PREPARE stmt;
      END;
    END IF;
  UNTIL done END REPEAT;
  CLOSE c;
END $$

DELIMITER ;
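The heart of the procedure is the placeholder substitution: each " ?" in the template becomes " &lt;table_name&gt;" before PREPARE/EXECUTE. A minimal sketch of that step in Python:

```python
# The substitution sp_4_each_table performs on its template argument:
# each " ?" is replaced by " <table_name>" before the statement is prepared.
def expand_template(template, table_name):
    return template.replace(" ?", " " + table_name)

# Hypothetical table list, as the cursor over information_schema would yield:
tables = ["customers", "orders"]
stmts = [expand_template("analyze table ?", t) for t in tables]
print(stmts)
```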

Now, with the procedure at hand, I'm moving on to the scheduling part, taking advantage of the great new EVENT feature of MySQL 5.1. See it in action.
First of all, check whether events are enabled:

Code:
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 4
Server version: 5.1.18-beta-community-nt-debug MySQL Community Server (GPL)

Type 'help;' or '\h' for help. Type '\c' to clear the buffer.

mysql> select @@event_scheduler;
+-------------------+
| @@event_scheduler |
+-------------------+
| OFF               |
+-------------------+
1 row in set (0.00 sec)
Add the event to the database (we'll activate the event scheduler later):

Code:
mysql> create event analyze_all
-> on schedule every 1 day
-> starts timestamp '2007-05-27 23:59:59'
-> ends timestamp '2007-05-27 23:59:59' + interval 1 year
-> on completion preserve
-> comment 'updates stats by analyzing tables'
-> do call sp_4_each_table('test', 'analyze table ?');
Query OK, 0 rows affected (0.03 sec)
Check its existence in the related information_schema table:

Code:
mysql> SELECT
-> event_name,
-> event_type,
-> status
-> FROM information_schema.`EVENTS` E;
+-------------+------------+---------+
| event_name  | event_type | status  |
+-------------+------------+---------+
| analyze_all | RECURRING  | ENABLED |
+-------------+------------+---------+
1 row in set (0.02 sec)
OK, the event is there.
Now I'll have to enable the event scheduler on the server and let it rip; it's very simple,
just issue:
Code:
SET GLOBAL event_scheduler=1;
Or start the server with:

Code:
--event_scheduler=1
Stopping an event without deleting it is also very simple:

Code:
mysql> alter event analyze_all disable;
Query OK, 0 rows affected (0.00 sec)

mysql> SELECT
-> event_name,
-> event_type,
-> status
-> FROM information_schema.`EVENTS` E;
+-------------+------------+----------+
| event_name  | event_type | status   |
+-------------+------------+----------+
| analyze_all | RECURRING  | DISABLED |
+-------------+------------+----------+
1 row in set (0.00 sec)

Sooner or later I'll show you how this, together with federated tables, the CSV storage engine and upserts, can really ease and improve an ETL process.

Featured on MySQL Newsletter June 2007. WOW !!!

Sunday, May 20, 2007

MYSQLDUMP, mind your options!!

While upgrading my desktop installation of MySQL I noticed that, unlike triggers, stored procedures are not dumped automatically. That's a "not so nice" feature of mysqldump, sic. Here is a snippet for those who don't have time to check the manual:

C:\Documents and Settings\user>mysqldump -uroot --single-transaction --routines
--databases test remote_test > c:/backup_mysql_20_05_07.sql

--routines does the trick.
--single-transaction is InnoDB-specific, for a consistent backup (it doesn't apply to MyISAM and other table handlers).
--databases is used because I only want to back up a specific set of databases.

Note that since MySQL 5.0.11 triggers are dumped automagically (the option is enabled by default); use --skip-triggers if you're downgrading your table structure.

Note also that before 5.0.11 triggers were not dumped at all, and routines had to wait until 5.0.13 (well, 5.0.20 for routines with a DEFINER ...), so act accordingly!!!

Another trick: you can use mysqldump to copy a database's structure from one database or server to another this way:

C:\Documents and Settings\pcuser>mysqldump -uusername -ppassword -d sourcedatabase -n | mysql -uusername -ppassword targetdatabase

This is common on *nix, but it works on Windows too.
The key is the -n (no create database) parameter; -d (no data) ensures only the structure is copied.

Friday, May 04, 2007

SAP BW, IF for your BEx formulas

Have you ever felt the need for an IF in your BEx formulas? Couldn't find it? That's because SAP expects you to know about Boolean math ...
Say we are subtracting two quantities and we want the result shown as 0 if the actual result is negative. The trick is very simple; look at this example:


( ( 'GR quantity' - 'PO quantity' ) > 0 ) * ( 'GR quantity' - 'PO quantity' )


The first block evaluates to 1 when GR quantity is greater than PO quantity, and to 0 otherwise, so the result will be something like:

GR quantity: 6
PO quantity: 4

((6 - 4) > 0) * (6 - 4) turns into 1 (true) * (6 - 4) = 2

The other case:

GR quantity: 4
PO quantity: 7

(( 4 - 7) > 0) * (4 - 7) will turn into 0 (false) * (4 - 7) = 0
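The same trick in plain arithmetic, sketched in Python (where a comparison likewise contributes a factor of 1 or 0 when multiplied):

```python
# The BEx trick in plain arithmetic: the comparison contributes a factor of
# 1 (true) or 0 (false), clamping negative differences to zero.
def clamped_diff(gr_qty, po_qty):
    return ((gr_qty - po_qty) > 0) * (gr_qty - po_qty)

print(clamped_diff(6, 4))  # 2
print(clamped_diff(4, 7))  # 0
```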

Saturday, April 07, 2007

Firebird 2.1 alpha, a quick glance at the new features

After reading about the new Firebird 2.1 alpha I was eager to test it, so here are some sample queries that show what the new features look like; all examples are based on the standard EMPLOYEE.FDB.
The first one is the MERGE statement, which is useful in many ways (my favourite is syncing tables).

MERGE INTO country c
USING (
  SELECT * FROM country
  UNION ALL
  SELECT 'Maroc' country, 'Dirham' currency FROM rdb$database
) cd
ON (c.country = cd.country)
WHEN MATCHED THEN
  UPDATE SET
    country = cd.country,
    currency = cd.currency
WHEN NOT MATCHED THEN
  INSERT (country, currency)
  VALUES (cd.country, cd.currency);

This ends up adding a row to the "country" table; but hey, that's merging data, and it looks more flexible than ... ON DUPLICATE KEY UPDATE ;-)
A simpler, but less effective and (IMHO) non-standard, alternative syntax is:

UPDATE OR INSERT INTO
  COUNTRY (COUNTRY, CURRENCY)
VALUES ('Poland', 'Zloty')
MATCHING (COUNTRY);

A syntax like "update or insert ... select ... matching ..." doesn't seem to be supported, though.
I can't say I ever felt the need for this one, but the more the merrier.
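To make the MERGE semantics concrete, here is a toy Python sketch that models the "country" table as a dict and applies the matched/not-matched branches (the existing rows are made up; only the Maroc/Dirham pair comes from the query above):

```python
# Toy model of MERGE ... ON (c.country = cd.country): the target "country"
# table as a dict keyed on country name. Existing rows are invented.
target = {"Italy": "Euro", "France": "Euro"}
source = list(target.items()) + [("Maroc", "Dirham")]  # SELECT ... UNION ALL ...

for country, currency in source:
    if country in target:
        # WHEN MATCHED THEN UPDATE (here a no-op re-assignment, as in the post)
        target[country] = currency
    else:
        # WHEN NOT MATCHED THEN INSERT
        target[country] = currency

print(target)
```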

Now, did you ever feel the need for MySQL's GROUP_CONCAT? Here you have LIST();
see it in action:

SELECT ep.emp_no, LIST(p.proj_name)
FROM EMPLOYEE_PROJECT ep
RIGHT OUTER JOIN project p ON ep.PROJ_ID = p.PROJ_ID
GROUP BY emp_no;

Guess what the result will be ...
(There seems to be a little bug in isql when showing the results of this kind of query, but it works as expected in FlameRobin.)
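What LIST()/GROUP_CONCAT computes is easy to model: per group, a separator-joined string of the values. A Python sketch with made-up sample rows:

```python
from itertools import groupby

# What LIST()/GROUP_CONCAT boils down to: per group, a separator-joined
# string of the values. The (emp_no, proj_name) rows below are made up.
rows = [(44, "DGPII"), (44, "VBASE"), (113, "DGPII")]
rows.sort(key=lambda r: r[0])  # groupby needs its input sorted by the key
grouped = {emp_no: ", ".join(proj for _, proj in group)
           for emp_no, group in groupby(rows, key=lambda r: r[0])}
print(grouped)
```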

You can also take advantage of the new database triggers to track connections with something like this:

A database table to hold connections:

CREATE TABLE connection_tracking
(
  user_name  varchar(50) NOT NULL,
  session_id integer NOT NULL,
  started_at timestamp NOT NULL,
  ended_at   timestamp
);

And a database trigger to record each connection:

SET TERM ^ ;
CREATE TRIGGER CONNTRACKER
ACTIVE ON CONNECT POSITION 1
AS
BEGIN
  INSERT INTO CONNECTION_TRACKING (USER_NAME, SESSION_ID, STARTED_AT)
  SELECT
    rdb$get_context('SYSTEM', 'CURRENT_USER'),
    rdb$get_context('SYSTEM', 'SESSION_ID'),
    current_timestamp
  FROM rdb$database;
END^
SET TERM ; ^

You'll end up with something like:

SQL> CONNECT "c:\programmi\firebird\firebird_2_0\examples\empbuild\employee.fdb"
user 'SYSDBA' password 'masterkey';
Database: "c:\programmi\firebird\firebird_2_0\examples\empbuild\employee.fdb",
User: SYSDBA
SQL> select * from connection_tracking;

USER_NAME                                            SESSION_ID STARTED_AT                ENDED_AT
================================================== ============ ========================= =========================
SYSDBA                                                        8 2007-04-08 13:01:04.8280

SQL>

You can also define an ON DISCONNECT trigger to update the row with the "ended_at" information.

SET TERM ^ ;
CREATE TRIGGER CONNTRACKER_END
ACTIVE ON DISCONNECT POSITION 1
AS
BEGIN
  UPDATE CONNECTION_TRACKING c
  SET c.ended_at = CURRENT_TIMESTAMP
  WHERE c.user_name = rdb$get_context('SYSTEM', 'CURRENT_USER')
    AND c.SESSION_ID = rdb$get_context('SYSTEM', 'SESSION_ID')
    AND c.STARTED_AT = (
      SELECT MAX(ct.started_at) FROM CONNECTION_TRACKING ct
      WHERE ct.USER_NAME = c.USER_NAME
        AND ct.SESSION_ID = c.SESSION_ID
    );
END^
SET TERM ; ^

This is a powerful database auditing feature!

Now, a feature that will make life much easier for those who like mimicking auto-increment keys: the great INSERT ... RETURNING ..., similar to what PostgreSQL has already implemented with success.
Here is an example; it lets you retrieve the generated value of the "auto-incrementing" column in one pass (I hope this will quickly be integrated into the PHP extension for Firebird).

SQL> select * from new_one;

        COL1 COL2
============ ==========
           1 pluto
           2 COL2
           3 COL2

SQL> insert into new_one(col2) values ('col3') returning col1;

        COL1
============
           5

SQL> select * from new_one;

        COL1 COL2
============ ==========
           1 pluto
           2 COL2
           3 COL2
           5 col3

The example is based on a previous post of mine; you can find it here.
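The one-pass idea can be illustrated with SQLite from Python, where `cursor.lastrowid` plays a role similar to Firebird's INSERT ... RETURNING (a sketch; the table mirrors the made-up `new_one` example):

```python
import sqlite3

# Analogy sketch: SQLite's cursor.lastrowid returns the generated key in
# one pass, much like Firebird 2.1's INSERT ... RETURNING. The table
# mirrors the post's made-up "new_one" example.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE new_one (col1 INTEGER PRIMARY KEY, col2 TEXT)")
cur = con.execute("INSERT INTO new_one (col2) VALUES ('pluto')")
generated_key = cur.lastrowid  # the value col1 received, no second query needed
print(generated_key)  # 1
```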

Another nice little thing is the ability to find out through SQL which server version you are on:

SELECT RDB$GET_CONTEXT('SYSTEM', 'ENGINE_VERSION')
FROM RDB$DATABASE;

which will result in

2.1.0

Hooray!!!

Friday, March 30, 2007

Being a mod means

This time I turned the kind gift from the Devshed guys into:

"Financial Intelligence", an interesting book which I hope will help in my ever-evolving job; I really need to understand what financial people are saying and what they really mean.



This one looks promising; it's a side of my professional life that I've never really dug into, but it's very interesting (and somehow fun).





Saturday, February 24, 2007

Another handy MySQL function

This time it's a quick function to validate email addresses, based on a regexp.
It can be used in a trigger to add data validation, or to check data already in your database that needs a clean-up ... really just a simple wrapper around a simple regexp query, but it can be helpful.
Here it is:

DELIMITER $$

DROP FUNCTION IF EXISTS `test`.`is_valid_email` $$
CREATE DEFINER=`root`@`localhost` FUNCTION `is_valid_email`(p_email varchar(64)) RETURNS tinyint(1)
BEGIN
  CASE
    -- note the doubled backslash: inside a MySQL string literal a single
    -- backslash is consumed, so '\.' would reach the regexp engine as a
    -- bare '.', matching any character
    WHEN NOT (SELECT p_email REGEXP '^[A-Z0-9._%-]+@[A-Z0-9.-]+\\.[A-Z]{2,4}$')
    THEN
      -- bad data
      RETURN FALSE;
    ELSE
      -- good email
      RETURN TRUE;
  END CASE;
END $$

DELIMITER ;

As per Mushu's comment, this is much cleaner, oops:

DELIMITER $$

DROP FUNCTION IF EXISTS `test`.`is_email_valid` $$
CREATE FUNCTION `test`.`is_email_valid` (p_email varchar(64)) RETURNS tinyint(1)
BEGIN
  CASE
    -- doubled backslash: a single '\.' in a MySQL string literal would
    -- reach the regexp engine as '.', matching any character
    WHEN p_email REGEXP '^[A-Z0-9._%-]+@[A-Z0-9.-]+\\.[A-Z]{2,4}$'
    THEN
      -- good email
      RETURN TRUE;
    ELSE
      -- bad data
      RETURN FALSE;
  END CASE;
END $$

DELIMITER ;


Also, if you can move such checks into the application code, it may be a wise thing to do, as executing regular expressions can easily consume all CPU resources.

Of course doing checks in the application code is cleaner, but the ability to add them later at the database level, or to run them in the database without interaction with external apps, is also handy ;-)
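For comparison, the same pattern applied client-side in Python (a sketch; like the SQL version it is a rough filter, not a full RFC-compliant validator, and the {2,4} TLD bound is dated):

```python
import re

# The post's pattern, applied client-side. A rough filter like the SQL
# version, not a full RFC-compliant validator.
EMAIL_RE = re.compile(r"^[A-Z0-9._%-]+@[A-Z0-9.-]+\.[A-Z]{2,4}$", re.IGNORECASE)

def is_valid_email(addr):
    return EMAIL_RE.match(addr) is not None

print(is_valid_email("user@example.com"))  # True
print(is_valid_email("not-an-email"))      # False
```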

Wednesday, February 14, 2007

Missing DLLs when starting GIMP 2.3.x

I've just downloaded and installed GIMP 2.3.14 for Windows and ran into what seems to be a widespread problem: startup errors due to missing DLLs. Of the many solutions listed here, I found that:
1. changing the shortcut icon on the desktop (point 5 in the page linked),
2. changing the shortcut icon in Start -> Programs -> GIMP -> GIMP 2, and
3. adding a registry key (point 6 in the page linked)
solved my problem.
Hope this helps.

Sunday, February 11, 2007

MySQL spanning different timezones

So what if you are planning a centralized MySQL server with many clients all around the world?
First of all, check the manual.
Then install the time zone description tables from here.
Then run a small example to see how the server and client time zones interact.
Here is the standard server setup:

mysql> show variables like 'time_zone';
+---------------+--------+
| Variable_name | Value  |
+---------------+--------+
| time_zone     | SYSTEM |
+---------------+--------+
1 row in set (0.00 sec)

Now add a table for our experiments

mysql> create table time_table (datetime_col datetime);
Query OK, 0 rows affected (0.11 sec)

Let's add some data to that table.
Note that I'm not setting any option for the client time_zone:

mysql> insert into time_table (datetime_col) values (now());
Query OK, 1 row affected (0.05 sec)

mysql> select * from time_table;
+---------------------+
| datetime_col        |
+---------------------+
| 2007-02-11 13:57:00 |
+---------------------+
1 row in set (0.00 sec)

OK, that's the current time (I'm connecting from the command line to a server running on the same PC). Now I'll set the client time zone, on the same connection!

mysql> set time_zone = 'Europe/London';
Query OK, 0 rows affected (0.01 sec)

mysql> insert into time_table (datetime_col) values (now());
Query OK, 1 row affected (0.01 sec)

mysql> select * from time_table;
+---------------------+
| datetime_col        |
+---------------------+
| 2007-02-11 13:57:00 |
| 2007-02-11 12:57:28 |
+---------------------+
2 rows in set (0.00 sec)

Hey, I just set a time zone and voilà: time travelling ;)
Now I'm connecting from another client, with no special time zone setting:

Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 2
Server version: 5.0.33 Source distribution

Type 'help;' or '\h' for help. Type '\c' to clear the buffer.

mysql> use test;
Database changed
mysql> insert into time_table (datetime_col) values (now());
Query OK, 1 row affected (0.02 sec)

mysql>

Back to the first connection:

mysql> select * from time_table;
+---------------------+
| datetime_col        |
+---------------------+
| 2007-02-11 13:57:00 |
| 2007-02-11 12:57:28 |
| 2007-02-11 13:58:32 |
+---------------------+
3 rows in set (0.00 sec)

So I really did set the time zone for a specific connection!!
This is handy, but it could bite; see:

mysql> select * from time_table order by datetime_col desc;
+---------------------+
| datetime_col        |
+---------------------+
| 2007-02-11 13:58:32 |
| 2007-02-11 13:57:00 |
| 2007-02-11 12:57:28 |
+---------------------+
3 rows in set (0.02 sec)

The rows were NOT inserted in this order!
And when the remote client loads that row, it will surely want to see the date and time in its own terms.

I know about the CONVERT_TZ() function, but I don't see how it helps here as long as a "datetime with time zone" or "timestamp with time zone" data type is not supported (PostgreSQL does support one); I know I could store the time zone in another column, but that's not a really practical solution.
Can anyone shed any light on this?
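One client-side workaround, sketched in Python: store UTC in the datetime column and convert to each client's zone only for display (the zone names below are just examples, not the post's actual setup):

```python
from datetime import datetime
from zoneinfo import ZoneInfo

# Workaround sketch: keep the column in UTC and convert per client for
# display. Zone names here are illustrative examples.
stored_utc = datetime(2007, 2, 11, 12, 57, 28, tzinfo=ZoneInfo("UTC"))

for tz in ("Europe/Rome", "Europe/London"):
    local = stored_utc.astimezone(ZoneInfo(tz))
    print(tz, local.strftime("%Y-%m-%d %H:%M:%S"))
```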

Saturday, January 27, 2007

Powerline, the good, the bad and the ugly ...

After much hesitation I decided to jump on the Powerline (a.k.a. HomePlug) bandwagon; that's it, ethernet over power!
At first it sounds great, I mean: no cabling, no pesky radiation flying around like with WiFi and so on ... an easy way of sharing an internet connection between my 2 PCs at home.
I had tested WiFi 802.11b before and wasn't impressed with its performance, so I reached for my purse and got a pair of Powerline adaptors, plugged them in and ... voilà, instant networking [the good].
This looked extremely good, easier than WiFi, but after some usage I noticed the connection was quite slow. So, unfortunately, I took a peek at the manual and discovered that the maximum theoretical speed (including channel control data) is 85 Mbit/s, while the effective maximum network speed can't exceed 25 Mbit/s :-( [the bad (1)], and that you need to install extra software/drivers to add security to your network, much like WEP for WiFi, which adds complication and is likely to subtract network speed [the bad (2)].
Network speed was unimpressive, so I ran a quick test with JGaa's NetCPS; here are the results for you to read, as seen from both sides:


C:\>netcps -s

NetCPS 1.0 - Entering server mode. Press ^C to quit

Waiting for new connection...

Client connected from 192.168.1.9

---> CPS 1519616.00 KPS: 1484.00 MPS: 1.45

Avrg CPS 1515546.00 KPS: 1480.03 MPS: 1.45

Peek CPS 1542144.00 KPS: 1506.00 MPS: 1.47

Client disconnected. 104857600 Kb transferred in 69.19 seconds.



C:\>netcps 192.168.1.5

NetCPS 1.0 - Entering client mode. Press ^C to quit

Connecting to 192.168.1.5 port 4455... Connected!

---> CPS 1518097.88 KPS: 1482.52 MPS: 1.45

Avrg CPS 1515524.13 KPS: 1480.00 MPS: 1.45

Peek CPS 1553373.25 KPS: 1516.97 MPS: 1.48

Done. 104857600 Kb transferred in 69.19 seconds.
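As a sanity check on those figures: NetCPS labels the total as "Kb", but the arithmetic only works out if it is bytes (104857600 bytes = 100 MiB), which reproduces the reported ~1.45 MB/s:

```python
# NetCPS prints "104857600 Kb transferred in 69.19 seconds"; the label says
# "Kb" but the figure is consistent with bytes (104857600 bytes = 100 MiB),
# which matches the reported ~1.45 MB/s.
bytes_transferred = 104857600
seconds = 69.19
mb_per_s = bytes_transferred / seconds / (1024 * 1024)
print(round(mb_per_s, 2))  # 1.45
```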


I wouldn't say very fast ...
You've read about [the good] and [the bad]; here is [the ugly]:



OK, I've made it (finally retrieved the photo from my mobile):

the plug is very light, but quite big, or at least bigger than I expected.
After all this, it could seem like a bashing post, so I'm adding the results of the same test under 802.11b:

C:\>netcps -s

NetCPS 1.0 - Entering server mode. Press ^C to quit

Waiting for new connection...

Client connected from 192.168.1.10

---> CPS 495616.00 KPS: 484.00 MPS: 0.47

Avrg CPS 487532.91 KPS: 476.11 MPS: 0.46

Peek CPS 526336.00 KPS: 514.00 MPS: 0.50

Client disconnected. 104857600 Kb transferred in 215.08 seconds.

As you can see, HomePlug is much faster than 802.11b; I don't have a more recent 802.11x device to compare against :-(

Monday, January 22, 2007

Signs of MySQL usage at BBC.CO.UK

A very interesting document from the BBC's Standards and Guidelines (Database Design and Development Standards v1.00) mentions only two databases. Guess what? Oracle and MySQL!!!
BTW the document is also a source of good design tips, like:

Normally, use only alphanumeric characters plus the underscore (_) character for table names.

MySQL
Table names are case sensitive under UNIX (or Linux). To avoid confusion, define/declare table names in upper case in any database creation SQL scripts.

No SQL language reserved words should be used as column names

Identify primary, unique, and foreign keys (to enforce important or critical data integrity) as part of the initial design process

To guard against slow queries degrading performance on the shared database infrastructure, the most commonly performed SQL queries should be analysed with an explain plan. This should use sufficient data to generate representative query execution plans

Indexes should be used where appropriate, use explain/explain_plan to demonstrate

All database APIs must define error handling, and success/failure notification

Designs should be as RDBMS agnostic as possible: for example, minimize use of sets and abstract data types, unless these are key to the design. It is accepted that there will inevitably be some RDBMS specific data typing

MySQL offers a number of storage types. The only ones that should be used are:
MyIsam – for general (especially read only) data.
InnoDB – for transactional data, or data with important referential integrity.


Any database data loading processes MUST create a summary log, an error file (in the event of any errors), and a bad file (for records that failed to load)

This reminds me of a feature request I made for a bad file option for LOAD DATA (bug 11480)

Database users should only be granted sufficient privileges to enable them to perform the tasks that they need to

... and many more.

I think that such a document is of general interest and very useful, especially to newbies.

Sunday, January 21, 2007

Playing with transactions on MySQL

Here is my attempt, failed as far as video quality goes, at showing a small example of transaction usage on MySQL 5.

Thursday, January 04, 2007

Falcon, the new MySQL Storage Engine

While taking a quick look at the newly released docs for Falcon, the new storage engine for the MySQL database from MySQL AB (and this is important, as the InnoDB and the now-ditched BDB storage engines for MySQL actually come from Oracle), I noticed two "strange" things, at least to someone thinking of large databases running on dedicated hardware.
First of all, data storage. Chapter 14.6.5.1, "Falcon data file and data structures", reads:

A single Falcon database file stores all record data, indexes, database structure and other information. ...

This doesn't sound like a good idea to me, as it's better to keep indexes and data separated on different disks; something like Oracle's implementation of tablespaces, with the ability to assign indexes and data to different tablespaces hosted on different disks, is important (I've noticed that in some cases up to 40% of the size of my data warehouse is made up of indexes).

Then data compression. Chapter 14.6.5.6, "Data Compression", reads:

Data stored in the Falcon tablespace is compressed on disk, but is stored in an uncompressed format in memory. Compression occurs automatically when data is committed to disk.

Won't this compress/uncompress loop badly affect bulk load speed and concurrency (forced writes ...)?
I hope the engine will be made available in binaries quickly and that proper benchmarks will clear up any doubts.