CiviCRM Community Forums (archive)

*

News:

Have a question about CiviCRM?
Get it answered quickly at the new
CiviCRM Stack Exchange Q+A site

This forum was archived on 25 November 2017. Learn more.
How to get involved.
What to do if you think you've found a bug.



  • CiviCRM Community Forums (archive) »
  • Old sections (read-only, deprecated) »
  • Support »
  • Using CiviCRM »
  • Using Import (Moderator: Yashodha Chaku) »
  • Importing European names from Excel (UTF8)
Pages: [1]

Author Topic: Importing European names from Excel (UTF8)  (Read 7988 times)

alanms

  • I post occasionally
  • **
  • Posts: 72
  • Karma: 5
Importing European names from Excel (UTF8)
November 05, 2010, 07:27:46 am
Hi,

An old issue but an important one where it'd be great for us to document a definitive answer. Importing sheets from Excel where there are European names can be a nightmare, and it really shouldn't be a difficult task needing any technical expertise.

Here's a step-by-step that works:

Here's the simplest and best solution/workaround I can find for now, and it's pretty simple: You create a macro button, copy & paste in the code below, then click on it. Step-by-step instructions are below if you don't know macros.

Good tip: create a blank Excel sheet with a macro button like this, save it somewhere easy for everyone to get at (e.g. on a CiviCRM sidebar) and ask staff to just copy and paste any list they want to import into this file, then hit the button. This method works for all UTF 8 languages - I tried it successfully with characters from Russian Cyrillic, Japanese, Polish and Arabic.

These instructions are now also on the wiki here: http://wiki.civicrm.org/confluence/display/CRMDOC33/Making+Microsoft+Excel+give+you+clean+data+to+import

Creating the macro (Excel 2007)
---------------------------
Making a macro for beginners: (macro code is at the end)
1) Create a shape for the button. Go to 'Insert' tab, 'Shape', pick any shape, put it anywhere where you can click on it.
2) Right click on the shape, 'Assign Macro', then hit the 'New' button.
3) It'll come up with a code screen. Copy and paste the code below into the place it puts the cursor (between 'Sub ...' and 'End Sub'). That's it!
4) Close the code screen completely (you don't need to save), click off the shape to deselect it. Then click on it and it should just work!

You can copy and paste this macro button into any other excel sheet you need to export - just right-click on it and copy and paste as normal. If you need to move it around, first right-click on it to select it without triggering the macro, then you can drag it about as normal.

It might not work on older versions of Excel. If the code gives an error, make sure you copied everything and no other non-code text, and make sure you're pasting the code before 'End Sub' and after 'Sub SomeShapeX_Click()' (e.g. it might be Sub SmileyFace3_Click() or Sub Rectangle1_Click() ). This is where it puts you by default.

The macro code to copy and paste:

Code: [Select]


Dim SrcRg As Range
Dim CurrRow As Range
Dim CurrCell As Range
Dim CurrTextStr As String
Dim ListSep As String
Dim FName As Variant

FName = Application.GetSaveAsFilename("", "CSV File (*.csv), *.csv")
If FName <> False Then
ListSep = Chr(44)
If Selection.Cells.Count > 1 Then
Set SrcRg = Selection
Else
Set SrcRg = ActiveSheet.UsedRange
End If

Set fsT = CreateObject("ADODB.Stream")
fsT.Type = 2
fsT.Charset = "utf-8"
fsT.Open

For Each CurrRow In SrcRg.Rows
CurrTextStr = ""
For Each CurrCell In CurrRow.Cells
If CurrCell.Value <> Empty Then
EscapedQuotes = Replace(CurrCell.Value, Chr(34), Chr(34) & Chr(34))
CurrTextStr = CurrTextStr & """" & EscapedQuotes & """" & ListSep
Else
CurrTextStr = CurrTextStr & ListSep
End If
Next
fsT.writetext CurrTextStr, 1
If (fsT.Position < fsT.EOS) Then
   fsT.SetEOS = fsT.Position
End If
Next
End If

fsT.SaveToFile FName, 2



Technical notes: This macro does 3 things:
  • It surrounds every non-blank cell in quotes, after "" escaping any quotes already there, to avoid CiviCRM/PHP's problem of ignoring non-Ascii accented characters at the start of a name, see http://issues.civicrm.org/jira/browse/CRM-345) (adapted from http://www.mrexcel.com/forum/showthread.php?t=320531)
  • It forces the CSV file to use commas regardless of settings, avoiding any continental issues of semi-colons being used to delimit the CSV file, see Xavier's comments below
  • It formats as a CSV file, and saves it (in a round-about way) as UTF8 using the "ADODB.Stream" object (adapted from http://www.josephtbradley.com/wp/2007/10/17/saving-as-utf-8-with-vba-in-excel-and-wrapping-text-with-no-orphans/ )
« Last Edit: January 07, 2011, 03:29:24 am by alanms »

xavier

  • Forum Godess / God
  • I’m (like) Lobo ;)
  • *****
  • Posts: 4453
  • Karma: 161
    • Tech To The People
  • CiviCRM version: yes probably
  • CMS version: drupal
Re: Importing European names from Excel (UTF8)
November 05, 2010, 08:16:50 am
Hi,

Great start!

Could you create an account in the wiki and post it there ?


The only think I'd add is that excel is using either "," or ";" as a column separator, based on it's version and region (eg. french is ";" english is "," if I recall, and hence never be able to import 100% of the time...

Another workaround worthwhile suggesting it to use openoffice (that has a filter so you can define unicode utf-8 as the right format).

how, and explain that it isn't civicrm fault if dear excel still uses crappy encoding that the rest of the world has happily replaced by utf-8, like any sane person should do...

X+
-Hackathon and data journalism about the European parliament 24-26 jan. Watch out the result

alanms

  • I post occasionally
  • **
  • Posts: 72
  • Karma: 5
Re: Importing European names from Excel (UTF8)
November 05, 2010, 09:16:10 am
Cheers, so just clarifying does this mean Civicrm requires "," seperated CSV at all times including when localized to France? If that's the case it looks like it'd be easy to force this macro to use commas regardless, off the top of my head it looks like changing

Code: [Select]
ListSep = Application.International(xlListSeparator)
to

Code: [Select]
ListSep = ","
should do it.

alanms

  • I post occasionally
  • **
  • Posts: 72
  • Karma: 5
Re: Importing European names from Excel (UTF8)
November 08, 2010, 03:55:26 am
EDIT:

Fixed it above so it forces Excel to always use "," and never ";" regardless of setting (though I sadly can't go to France to test it fully... unless someone wants to cover my Eurostar fare? :) ), and I've also fixed a peculiarity where Civi was ignoring backslash escape characters in the import (it now uses double-double quotes to escape double quotes).
« Last Edit: November 08, 2010, 04:44:14 am by alanms »

xavier

  • Forum Godess / God
  • I’m (like) Lobo ;)
  • *****
  • Posts: 4453
  • Karma: 161
    • Tech To The People
  • CiviCRM version: yes probably
  • CMS version: drupal
Re: Importing European names from Excel (UTF8)
November 08, 2010, 09:56:26 am
I just found phpexcel. They seem to be able to read directly.

Have you try it by any chance ? Shouldn't be too hard to add as a replacement of csv
-Hackathon and data journalism about the European parliament 24-26 jan. Watch out the result

alanms

  • I post occasionally
  • **
  • Posts: 72
  • Karma: 5
Re: Importing European names from Excel (UTF8)
November 08, 2010, 06:57:23 pm
Hmm, that's interesting. In the long term I'm sure a lot of organisations would love to further enable staff to import XLS files directly without needing to think about anything technical.

For this, I was literally just looking for a reliable way to get sheets up that works right now - but I could imagine this being a very popular feature. Right now reading between the lines of forum posts it looks like lots of orgs lean more towards having a resident techie responsible for all imports, rather than having non-techie project and fundraising staff importing and exporting their own lists with confidence.

Making it easy for any staff member to casually export a list, take it to an event on the office laptop, update it based on who shows up, then stick it back up again would be a big selling point - though I think this would need a broader usability overhaul in the import and export interface and error handling. Definitely one to think about!

alanms

  • I post occasionally
  • **
  • Posts: 72
  • Karma: 5
Re: Importing European names from Excel (UTF8)
November 10, 2010, 12:06:13 pm
Quick update - I fixed this to work properly for all UTF-8 characters (I've updated the code and the original post). It now lets you save real UTF 8 csv files in Excel directly, and it correctly exports all UTF8 characters I tested to the csv, including Japanese, Arabic etc. :)

Quick note - oddly, some of the utf8 characters from this were displaying as boxes in my text edito, even though they worked fine in test Civi imports and even displayed properly in ol' Notepad. Probably means the text editor needs some character support add-on.

xavier

  • Forum Godess / God
  • I’m (like) Lobo ;)
  • *****
  • Posts: 4453
  • Karma: 161
    • Tech To The People
  • CiviCRM version: yes probably
  • CMS version: drupal
Re: Importing European names from Excel (UTF8)
November 10, 2010, 12:35:08 pm
Hi,

I've toyed with a js/flash solution. Work fine for me (plugged it on export, needs a 3.3. version)

http://github.com/tttp/civi_export

-Hackathon and data journalism about the European parliament 24-26 jan. Watch out the result

alanms

  • I post occasionally
  • **
  • Posts: 72
  • Karma: 5
Re: Importing European names from Excel (UTF8)
November 10, 2010, 01:22:03 pm
Looks good - it'd be great to avoid having to explain how to add UTF-8 byte order markers when people are puzzled that the sheet they exported from Civi and opened in Excel has made a mess of all their European names!

xavier

  • Forum Godess / God
  • I’m (like) Lobo ;)
  • *****
  • Posts: 4453
  • Karma: 161
    • Tech To The People
  • CiviCRM version: yes probably
  • CMS version: drupal
Re: Importing European names from Excel (UTF8)
November 12, 2010, 09:15:06 pm
http://phpexcel.codeplex.com/ allows to write and read excel files from php (not the most innovative product naming there is, but for once what it does it written on the box ;)

It's LGPL.

X+
-Hackathon and data journalism about the European parliament 24-26 jan. Watch out the result

Traveler777

  • I post occasionally
  • **
  • Posts: 43
  • Karma: 0
  • CiviCRM version: 3.3.2
  • CMS version: Drupal 6.2
Re: Importing European names from Excel (UTF8)
December 13, 2011, 07:46:39 pm
Exactly what I was looking for. Can anyone help with a "Run-time error 1004" error message? Method "GetSaveAsFilename of object_Application" failed. The following section in the code is highlighted when debugging: "FName = Application.GetSaveAsFilename("", "CSV File (*.csv), *.csv")"

I am using Office 2011 for Mac.

Thank you for any help!

Pages: [1]
  • CiviCRM Community Forums (archive) »
  • Old sections (read-only, deprecated) »
  • Support »
  • Using CiviCRM »
  • Using Import (Moderator: Yashodha Chaku) »
  • Importing European names from Excel (UTF8)

This forum was archived on 2017-11-26.