Sunday, July 18, 2010

sed strip unicode out of file

e2 80 8b is the UTF-8 byte sequence (in hex) for Unicode code point U+200B (zero width space). The \x escapes below are a GNU sed extension.
sed -e "s/\xe2\x80\x8b//g" input.u8 >output.u8
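
To check that the substitution really removes the bytes, here is a minimal sketch that builds its own sample file and compares byte counts before and after. It assumes GNU sed; the sample contents and the LC_ALL=C setting (which makes sed treat the input as raw bytes) are my additions, not part of the one-liner above.

```shell
# Sample file: "a", U+200B (octal 342 200 213 = hex e2 80 8b), "b", newline.
printf 'a\342\200\213b\n' > input.u8

# Strip every occurrence of the three-byte sequence.
# LC_ALL=C: treat input as bytes rather than multibyte characters (assumption: GNU sed).
LC_ALL=C sed -e 's/\xe2\x80\x8b//g' input.u8 > output.u8

# Byte counts: the output should be exactly 3 bytes shorter than the input.
wc -c < input.u8
wc -c < output.u8
```

od -c output.u8 is another way to confirm that no 342 200 213 bytes are left.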

Tuesday, July 13, 2010

mysql union

A plain union removes duplicate rows, the same as union distinct; union all keeps them.

mysql> (select 'a') union (select 'a');
+---+
| a |
+---+
| a |
+---+
1 row in set (0.00 sec)

mysql> (select 'a') union all (select 'a');
+---+
| a |
+---+
| a |
| a |
+---+
2 rows in set (0.00 sec)

mysql> (select 'a') union distinct (select 'a');
+---+
| a |
+---+
| a |
+---+
1 row in set (0.00 sec)