CiviCRM Community Forums (archive)

This forum was archived on 25 November 2017.

Author Topic: Email Preview Cluster  (Read 5384 times)

utkarshsharma

  • I’m new here
  • *
  • Posts: 22
  • Karma: 0
  • CiviCRM version: 4.5.8
  • CMS version: Drupal 7
  • MySQL version: 5.5.41
  • PHP version: 5.5.9
Email Preview Cluster
March 21, 2015, 10:38:50 am
Hi, I am Utkarsh Sharma, a third-year undergraduate student at IIT Bombay. I am looking forward to getting involved in the Email Preview Cluster project. I am really excited to contribute to CiviCRM and the open source community by working on this project, as almost all email preview and testing software, like Litmus and Mailchimp, is paid.
I looked up how this software works and found that one of these tools, 'Email on Acid', sends the email through each email client application and parses together a screen capture of the final result. Do we plan to implement our project this way, or do we just use the free APIs that we have from putsmail (which is written in Ruby) and build our GUI around that?

totten

  • Administrator
  • Ask me questions
  • *****
  • Posts: 695
  • Karma: 64
Re: Email Preview Cluster
March 21, 2015, 03:01:35 pm
Hello, Utkarsh. I think you're exactly right about what the options are -- either send the email to each client and parse a screen capture, or use an API.

I hadn't heard of Putsmail. A free API sounds good -- do you have any links or details? When one visits putsmail.com, it seems that they've been acquired by Litmus, and I don't see any API docs.

Both approaches -- doing screen-capture or using an API -- can be tremendously valuable, but they would push the project in very different directions. As a few trade-offs:

 * API Client / Frontend: This would focus more on updating the CiviMail user-interface -- ie creating new buttons and windows for requesting and browsing the previews. The cool thing about this: you get to work with frontend code (AngularJS/JS/CSS); also, the code becomes part of the main CiviCRM app, can go "live" sooner, and works with many different email clients. However, it can only work for organizations that can pay for API access to the chosen service (unless we can confirm there's a free API).

  - Aside: If you go this route, check http://forum.civicrm.org/index.php/topic,35794.0.html

 * Screen-capture / Backend: This should probably focus on doing the backend -- in essence, you make the API which performs the screen-capture. To keep this manageable, we could focus on web-mail systems (Gmail, Yahoo Mail, Outlook.com) and use a tool like http://webdriver.io/ or http://www.seleniumhq.org/. There are a few cool things about this: you're mostly writing new code, can think about queueing/scalability issues, and can use almost any framework (NodeJS, Symfony, etc) - and the result can be used by Civi and by other projects. However, there might not be enough time to work on a frontend, so someone else will need to pick up that piece.

  - Aside: If you go this route, then I'd propose a different exercise first. Install NodeJS and follow the guide from webdriver.io to get started. Then adapt the example to save a screenshot - and maybe adapt it to login to gmail.com.

utkarshsharma

Re: Email Preview Cluster
March 22, 2015, 06:10:24 am
Thanks for your reply, Tim! I found out about putsmail from the CiviCRM Ideas page itself. This is what I found, https://github.com/w3geekery/putsmail.com.

I also found this when I was looking for free APIs, but I can't tell if this will prove of any use:
https://github.com/magento-hackathon/E-MailPreview/tree/944a115f4640ec4d78386b19d28900be243164b5
Can you tell me if we can use any of this for the project?

totten

Re: Email Preview Cluster
March 22, 2015, 03:01:16 pm
Both of those are solving different problems. I'll add a link which I think is more on-point.

RE: https://github.com/w3geekery/putsmail.com

When you visit putsmail.com, there's a tagline near the top which says: "Send your HTML to any email address for design testing and debugging". Then, there are form fields where you can enter in some recipient email addresses and some HTML.

This tool looks like it's intended for web-designers who usually write HTML (for the web) but don't know how to send an HTML email. (Most personal email tools, like Gmail or Mac Mail, don't let you type in HTML directly.) So a designer can use this form to send himself a test email.

This is a problem that CiviMail already addresses -- i.e. when you go to "Mailing => New Mailing", you can compose markup (using rich-text or source-code) and then send a test email to yourself. You can see what I mean at https://civicrm.org/sites/civicrm.org/files/blog/CiviMail-Preview.png or by logging into http://d46.demo.civicrm.org/ . (Note: To prevent spam, the demo sites won't generate real emails. But you can use the UI.)

On the chance that Putsmail really does what we want (in some hidden/undocumented way), I downloaded the code and searched ("grep -r") for various phrases that would be used in managing email screen-captures across many clients -- e.g. "screen", "capture", "shot", "image", "png", "jpg", "selenium", "webdriver", "capybara", "worker", "queue", "task", "job". Nothing useful showed up. :(

RE: https://github.com/magento-hackathon/E-MailPreview/

Looking through the code, a couple central pieces appear to be:

 * https://github.com/magento-hackathon/E-MailPreview/blob/master/app/code/community/Hackathon/EmailPreview/Model/EmailPreview.php
 * https://github.com/magento-hackathon/E-MailPreview/tree/master/app/code/community/Hackathon/EmailPreview/Model/Mail/Type

"Previewing" in this context seems to deal with variable substitution -- e.g. if you have an email template with the phrase "<p>Hello, {contact.first_name}</p>", then you might want to see what it looks like when a first name is actually plugged in ("<p>Hello, Utkarsh</p>"). If you have a dozen different templates, you might have different variables to plug into each one. This substitution is useful -- sometimes you spell the variable incorrectly or reference missing data, and a preview will help identify that.

CiviMail does this kind of previewing already -- basically any of the existing preview buttons in https://civicrm.org/sites/civicrm.org/files/blog/CiviMail-Preview.png will perform variable-substitution.
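
As a rough illustration of variable substitution, here is a minimal JavaScript sketch. The function name `renderTemplate` and the flat `{entity.field}` lookup are assumptions made for this example; it is not CiviMail's actual token engine.

```javascript
// Hypothetical helper: replace {entity.field} tokens with values from `data`.
// Unknown tokens are left in place and reported, so a preview can flag typos.
function renderTemplate(template, data) {
  const missing = [];
  const html = template.replace(/\{(\w+)\.(\w+)\}/g, (token, entity, field) => {
    const record = data[entity];
    if (record && Object.prototype.hasOwnProperty.call(record, field)) {
      return String(record[field]);
    }
    missing.push(token);
    return token;
  });
  return { html, missing };
}

// The second token is deliberately misspelled to show how a preview exposes it.
const result = renderTemplate(
  '<p>Hello, {contact.first_name} {contact.last_nmae}</p>',
  { contact: { first_name: 'Utkarsh', last_name: 'Sharma' } }
);
console.log(result.html);
console.log(result.missing);
```

The misspelled token stays visible in the output and is listed in `missing`, which is the kind of mistake an in-browser preview helps catch.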

However, in both magento-hackathon/E-MailPreview and CiviMail, the preview is displayed in the current web-browser. The problem is that different email clients have their own embedded web-browsers with funky rendering rules - and each may show the email differently.

RE: http://previewmyemail.com/features/api/ , http://previewmyemail.com/resources/help

I think this is the most suitable API I've seen -- e.g. you can call the "CreatePreview" API with options like:

Code:
CreatePreview:
  * Email Subject: 'Hello world in green'
  * Email Body: '<body style="background:green">Hello, world</body>'
  * Target Apps: Gmail, Microsoft Outlook, iOS Mail.

Then, you can call the "FetchPreview" API, and it will return a list of screenshots. If you look at the screenshots, you might see that each email app uses a different font and treats the green differently (e.g. one app might set the entire background green; another might have a small area with green background; another might ignore the green style).
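
The two-call flow can be sketched with an in-memory stand-in. Everything here is hypothetical: the class `FakePreviewService`, its method and field names, and the screenshot file names are placeholders, not the documented previewmyemail.com parameters, and no network request is made.

```javascript
// In-memory stand-in for a remote preview service (hypothetical, no network calls).
class FakePreviewService {
  constructor() {
    this.jobs = new Map();
    this.nextId = 1;
  }

  // Mirrors the "CreatePreview" idea: register a job and return its ID.
  createPreview({ subject, body, targetApps }) {
    if (!subject || !body || !Array.isArray(targetApps) || targetApps.length === 0) {
      throw new Error('subject, body and at least one target app are required');
    }
    const id = this.nextId++;
    this.jobs.set(id, { subject, body, targetApps });
    return id;
  }

  // Mirrors the "FetchPreview" idea: one screenshot entry per target app.
  fetchPreview(id) {
    const job = this.jobs.get(id);
    if (!job) {
      throw new Error('unknown preview id: ' + id);
    }
    return job.targetApps.map((app) => ({
      app,
      // Placeholder file name; a real service would return an image or a URL.
      screenshot: 'preview-' + id + '-' + app.toLowerCase().replace(/\s+/g, '-') + '.png',
    }));
  }
}

const service = new FakePreviewService();
const previewId = service.createPreview({
  subject: 'Hello world in green',
  body: '<body style="background:green">Hello, world</body>',
  targetApps: ['Gmail', 'Microsoft Outlook', 'iOS Mail'],
});
const screenshots = service.fetchPreview(previewId);
console.log(screenshots);
```

A CiviMail integration would mostly be a thin client around two calls of this shape, plus a screen for browsing the returned images.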

utkarshsharma

Re: Email Preview Cluster
March 23, 2015, 12:16:51 am
Thanks. This really helps!
I went and played around with CiviCRM mailing, just to get the feel of it.
I am getting used to WebdriverIO and that gave me a good insight into what my project might be.
I still had a few questions before I can start with my proposal, though.
Does the project involve using this previewmyemail API or are we going to write the code for it? I saw this one was paid too.

My thoughts and interpretation of what happens if we choose either one of the two options:

If we use their API:
We'll only have to work on integrating their code with CiviCRM's and then make a Preview button which will call the CreatePreview API and then write some additional code to call the FetchPreview API and then show the screenshots to the user. But, as you said, this will only work for organizations that can pay for the API. Also, if we actually are going to use a paid API, why not use Litmus, which is the most popular one?

If we write the code from scratch:
I think even if we go with this option, we'll have enough time to work on the GUI as it doesn't seem like a big task as compared to the back-end task. A bulk of the time is going to be spent on writing the two APIs. And, as per your suggestion, we'll keep it to web-mail systems, to keep it manageable and webdriver will help us there. I strongly feel we can include both, creating the API as well as the GUI in the project and it can still go live in a good time. :)
I wanted your opinion on if it is achievable, or if we can do even better. If not, then how far should the project go?

utkarshsharma

Re: Email Preview Cluster
March 25, 2015, 12:53:00 pm
Hi, this is what I think I'll be doing in the project:
In CiviMail, there already are two Preview buttons, namely 'Preview as HTML' and 'Preview as Plaintext'. I'll be adding another button 'Preview in different Email Clients' (or something like that) there, which when clicked, should display the user the screenshots of what the email would look like when opened in the said email clients. I'll achieve this by using Selenium Webdriver to write a code (in JavaScript), which would:
1. Pick up the HTML version of the mail.
2. Send it to predefined email IDs on different Email clients
3. Log in to these predefined email IDs one by one.
4. Search for the mail and open it.
5. Take screenshot and return these images.
I'd then have to display these images side by side. I can display it on a new tab or on the same tab, like facebook does in "light box", using a jquery popup.
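
The five steps above can be sketched as a small pipeline. This is only an outline under stated assumptions: `previewInClients` and the `steps` object are hypothetical names, and the step functions are stubs so that the example runs without a browser or mail server. Real implementations would drive Selenium Webdriver.

```javascript
// Run the steps for each webmail client in turn.
// `steps` bundles hypothetical functions; real ones would drive a browser.
async function previewInClients(mail, clients, steps) {
  const results = [];
  for (const client of clients) {
    await steps.send(mail, client.address);                          // step 2
    const session = await steps.login(client);                       // step 3
    const message = await steps.openMessage(session, mail.subject);  // step 4
    const image = await steps.screenshot(session, message);          // step 5
    results.push({ client: client.name, image });
  }
  return results;
}

// Stub implementations that only record what was asked of them.
const calls = [];
const stubSteps = {
  send: async (mail, address) => { calls.push('send:' + address); },
  login: async (client) => { calls.push('login:' + client.name); return { client: client.name }; },
  openMessage: async (session, subject) => { calls.push('open:' + subject); return { subject }; },
  screenshot: async (session, message) => session.client + '.png',
};

// Step 1: the HTML version of the mail is the input.
const demo = previewInClients(
  { subject: 'Hello world in green', html: '<body style="background:green">Hello, world</body>' },
  [
    { name: 'gmail', address: 'preview@example.org' },
    { name: 'yahoo', address: 'preview@example.com' },
  ],
  stubSteps
);
demo.then((results) => console.log(results));
```

Because the clients are handled one after another, the total wait grows with the number of clients, which is the timing concern raised in the downsides.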

Now, is that what you guys are looking for from this project?

Also, I'd like to list down a few downsides of this approach:
1. This can only be used for web-mail services, like Tim pointed out.
2. Steps 2 to 5 might take a lot of time, ranging from 20 to 50 seconds per email client, or maybe more. It takes a little time for the mail to be delivered and even more time is taken in logging into Gmail and YahooMail and other things.

Do you have any ideas regarding how we can make it better? Also, I'd love to have any other input that you might like to give.

totten

Re: Email Preview Cluster
March 25, 2015, 04:16:12 pm
Quote
I'd then have to display these images side by side. I can display it on a new tab or on the same tab, like facebook does in "light box", using a jquery popup.

Yeah, that sounds great. :)

I wouldn't get too worked up on making the layout perfect right now - the UX will probably evolve as you get more comfortable with the tools and ideas.

Quote
2. Steps 2 to 5 might take a lot of time, ranging from 20 to 50 seconds per email client, or maybe more. It takes a little time for the mail to be delivered and even more time is taken in logging into Gmail and YahooMail and other things.

This is a really good point, and it definitely affects the design of the UX. Some kind of a status indicator would be quite useful (so that the user knows what's complete and maybe make a guess at how long they'll be waiting).
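
One possible shape for such a status indicator, as a sketch only: the step names in `STEPS` and the `createTracker` helper are assumptions for illustration, and a real UI would poll or subscribe to something like `status()`.

```javascript
// Track which of the per-client steps have finished, so a UI can show progress.
const STEPS = ['send', 'login', 'open', 'screenshot'];

function createTracker(clients) {
  const done = new Map(clients.map((name) => [name, 0]));
  return {
    // Record that one more step finished for the given client.
    advance(client) {
      if (!done.has(client)) {
        throw new Error('unknown client: ' + client);
      }
      done.set(client, Math.min(done.get(client) + 1, STEPS.length));
    },
    // Summarise progress as completed clients and an overall percentage.
    status() {
      let finishedSteps = 0;
      let finishedClients = 0;
      for (const count of done.values()) {
        finishedSteps += count;
        if (count === STEPS.length) {
          finishedClients += 1;
        }
      }
      const totalSteps = done.size * STEPS.length;
      return {
        finishedClients,
        totalClients: done.size,
        percent: totalSteps === 0 ? 100 : Math.round((finishedSteps / totalSteps) * 100),
      };
    },
  };
}

const tracker = createTracker(['gmail', 'yahoo']);
STEPS.forEach(() => tracker.advance('gmail')); // gmail: all four steps done
tracker.advance('yahoo');                      // yahoo: mail sent, nothing else yet
console.log(tracker.status());
```

Reporting progress per client lets the user see finished screenshots while slower clients are still being processed.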

A big question will be how to break the project into phases. Tackling the full system may be possible - and it's a great ambition. :) But software projects are often harder than one expects, esp. when learning to work with a new toolset, so it's good to think a bit about contingencies:

 * Best case scenario: Build a fully open-sourced email screencapture system for webmails, including frontend and backend.
 * Worst case scenario: Produce several incomplete, undocumented pieces that are hard to setup and don't work together.
 * Middle-case scenario: Produce one or two strong pieces (with some docs/testing) - but not a full system.

To try to make the time/risk more concrete, consider the attached diagram of a full service. Each tier will come with its own set of tools and issues (e.g. in tier 1, there's AngularJS+Civi; in tier 3, there's Webdriver+Selenium; in tier 4, there's diverse markup/login-steps for each webmail system). An extremely optimistic estimate would be one week per tier (4 weeks total) -- but in reality we will have several obstacles (learning new tools, unforeseen design problems, bug-fixing, etc.), so it could easily become 4 weeks per tier (16 weeks total).

I would suggest a few activities you might want to work on / think about as you develop a proposal for GSoC:
 - Sketch out the messages that need to pass between components. Do they use HTTP GETs or POSTs? What fields need to pass between each tier?
 - Do an experiment with one or two unfamiliar tools -- e.g. install CiviCRM from git, or maybe write a Webdriver/NodeJS script to login to a website. Send the code.
 - Think about the order of development. Which tier would you write first? Why? How do you know when a tier is "good enough" or "complete"? (Note: There's no wrong answer to that. Ask 3 people, and you'll get 3 different answers. The idea is to form some reasons and goals.)
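
For the first activity, one way to start is to write the messages down as plain data. The sketch below is an assumption-laden example: the `/preview` path, the field names, and the choice of POST (an HTML body can be large, so it travels in the request body rather than a GET query string) are illustrative, not an agreed design.

```javascript
// Hypothetical request from the CiviMail UI to the preview service.
function buildPreviewRequest(subject, html, clients) {
  return {
    method: 'POST',
    path: '/preview',
    body: JSON.stringify({ subject, html, clients }),
  };
}

// Hypothetical response from the preview service back to the UI.
function parsePreviewResponse(rawBody) {
  const data = JSON.parse(rawBody);
  if (!Array.isArray(data.screenshots)) {
    throw new Error('response is missing the "screenshots" list');
  }
  return data.screenshots.map((s) => ({ client: s.client, url: s.url }));
}

const request = buildPreviewRequest('Hello world in green', '<p>Hello, world</p>', ['gmail']);
console.log(request);

const parsed = parsePreviewResponse(
  JSON.stringify({ screenshots: [{ client: 'gmail', url: 'https://example.org/shots/1.png' }] })
);
console.log(parsed);
```

Writing the fields down this way makes it easier to spot what each tier needs from its neighbour before any real code exists.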

xavier

  • Forum Godess / God
  • I’m (like) Lobo ;)
  • *****
  • Posts: 4453
  • Karma: 161
    • Tech To The People
  • CiviCRM version: yes probably
  • CMS version: drupal
Re: Email Preview Cluster
March 26, 2015, 01:38:59 am
Quick reminder, not sure if the student(s) interested have done it already for this project, but you have to register and submit the proposal before tomorrow

http://forum.civicrm.org/index.php/topic,36143.0.html

If it's done already, all good: we'll discuss internally and with you, help you get to know our development workflow and the tools we use, and give you a chance to mingle with the community.

utkarshsharma

Re: Email Preview Cluster
March 26, 2015, 01:01:49 pm
Here is my proposal for the project and it'd be great to have your feedback on it. :)

PROJECT INFORMATION

Which project idea sparks your interest and why?

The project on Email Preview Cluster interests me the most.
Rendering of emails is pretty inconsistent across major email clients. This means a user's email may be displayed poorly in some clients while it renders well in others. This can be a major setback for an organisation's email campaigns, and hence it would be nice to see what your email will look like in the receiver's mailbox before you send it.
With respect to that, this project gives me an opportunity to contribute to CiviCRM and, moreover, to the open source community, as almost all email preview and testing software, like Litmus and Mailchimp, is paid.
Another factor that makes me choose this project is that it involves working with modern frameworks such as AngularJS and NodeJS, and its requirements sit in perfect alignment with my coding skill-set.

Treating this project as a real proposal, provide your implementation plan with as much detail as possible such as weekly time breakdowns, methods of mentor communication, project management, and when to expect specific results/deliverables.

I can use IRC, Gmail or Hangouts to communicate with the mentor. I plan to keep my mentor informed about my progress and plans every week or a half. I'll be sending my mentor an update at the start of every week, which would describe the work done in the previous week, the obstacles I faced, my plans for the coming week and where the project stands as a whole. I will also not hesitate in seeking my mentor's help midweek, in case I fall into trouble. I plan to post the above mentioned start-of-week update, along with a midweek update on the forum for everybody to see and contribute.

Detailed Description (only if this is a new idea, otherwise if there is a description online, please provide the URL):

Email Preview Cluster involves displaying what an email would look like to receivers who use different email clients. So we'll need to do exactly that: open the mail in different email clients for the user, take screenshots and display these pictures on the user's screen.
In CiviMail, there already are two Preview buttons, namely 'Preview as HTML' and 'Preview as Plaintext'. We'll have another button 'Preview in different Email Clients' (or something like that) there, which when clicked, should display the user the screenshots of the email when opened in the said email clients. I'll achieve this by using Selenium Webdriver, based on NodeJS, to write code (in JavaScript), which would:
1. Pick up the HTML code of the mail, along with the subject.
2. Send it to predefined email IDs on different Email clients
3. Log in to these predefined email IDs one by one.
4. Search for the mail and open it.
5. Take screenshot and return these images.
We'll then have to display these images as a cluster on the user's screen.
Breaking it down into four parts, as suggested by Tim:
TIER 1:
CiviMail page on the user's CiviCRM website.
A request will be created when the user clicks on the Preview button. The HTML code of the email and its subject would be sent forward as part of the request. This tier would be expecting a response from the server to redirect it to display an image cluster as the final output. Building the GUI would be the last step of this project and will be pursued only if the rest of the project is over. If required, it can be finished off after the allotted time. This tier can be deemed "good enough" if we can pass the required fields over to the next tier and take the final output back in the form of images properly.
TIER 2:
The request handling by the organisation's server.
When a request comes the way of the server, it starts processing it once the other pending requests have been responded to. It may involve using the HTTP GET and POST methods and might bring with it a Uniform Resource Identifier (URI). This helps the server in identifying the kind of request being made and in our case, calls the selenium webdriver to execute the request. The request brings with it the inputs (HTML code, subject of the email) which the selenium webdriver uses to perform its operations and provides the screenshots as output. The server then generates a response and sends it back to wherever the request came from.
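
The "process it once the other pending requests have been responded to" behaviour is a first-in, first-out queue. A minimal sketch follows; the `JobQueue` class and the synchronous `worker` callback are simplifications assumed for this example, since a real worker driving a browser would be asynchronous.

```javascript
// Minimal first-in, first-out job queue: requests are processed in arrival order.
class JobQueue {
  constructor() {
    this.pending = [];
    this.nextId = 1;
  }

  // Add a request to the back of the queue and return its job ID.
  enqueue(request) {
    const job = { id: this.nextId++, request };
    this.pending.push(job);
    return job.id;
  }

  // Process the oldest job with the supplied worker; returns null when idle.
  processNext(worker) {
    const job = this.pending.shift();
    if (!job) {
      return null;
    }
    return { id: job.id, result: worker(job.request) };
  }
}

const queue = new JobQueue();
queue.enqueue({ subject: 'First mail', html: '<p>one</p>' });
queue.enqueue({ subject: 'Second mail', html: '<p>two</p>' });

// Stand-in worker: a real one would hand the request to the webdriver tier.
const fakeWorker = (request) => 'screenshot of "' + request.subject + '"';
const first = queue.processNext(fakeWorker);
const second = queue.processNext(fakeWorker);
const idle = queue.processNext(fakeWorker);
console.log(first, second, idle);
```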
TIER 3:
Selenium-Webdriver (or Selenium 2.0) for Node.
This is the part the project hinges on. In this part the job at the front of the queue on the server is processed and selenium webdriver takes over and executes the above mentioned steps returning the screenshots as output. There are standard javascript codes for logging onto a website, sending mails, taking screenshots. We'll have to figure out what GET and POST requests are to be sent or what exact protocol is to be used to communicate with each email website and then use the available codes to fulfill our purpose.
This tier will execute each of the above mentioned 5 steps one by one and keep a log of what amount of work is done and relay this information back to the first tier, so the user can know how much longer he needs to wait. This however is a part of the GUI and will be tended to after the rest of the work is done.

TIER 4:
The Email Clients/Actual execution of the job.
This is the part where the real work will be done. The selenium webdriver code will take care of whatever is done here. Requests will be sent to the browser to perform all the steps and the screenshots will be relayed back at the end.
I've divided this tier further. We can first pick up one email client (say Gmail) and work on it and once we manage to execute all these steps in succession- take HTML email and send it, log in, open mail, take a screen shot and fetch it back for Gmail, this part can be considered as working and can be extended to other webmail clients in no time.

I plan to execute tier three and four first. I first intend to write the selenium code for one email client. Once this works smoothly, I'll be working on the communication between tier one and two. I will work on creating a request, forwarding it to the server. The images that the server sends will be taken back as input on the CiviCRM page and displayed to the user. The next part will be joining these two together and setting up a flow of requests and data between the CiviMail page and the email client. Once this is done we'll have a working prototype of the project. Making a click would result in us getting back the screenshot. After this we can extend this model to other webmail clients. Then finally we can work on building the GUI.

Expected Deliverables (list the main items that you will deliver during the program):
Item 1 - 23 June
    Midterm Submission: Model for TIER 3 and 4
    Blog post about the Midterm Submission
Item 2 - 07 July
    Alpha Version
    Blog post about the Alpha version
Item 3 - 28 July
    Beta Version
    Blog post about the Beta version
Item 4 - 18 Aug
    Code will be made available on Github
    Final blogpost about road maps to incorporating the project with CiviCRM
    Detailed Documentation of the project

Timeline (break down by every week of GSoC):
12 May - 25 May: Research Weeks 1 and 2
* Get up to speed with NodeJS
* Get up to speed with AngularJS
* Get up to speed with Selenium-Webdriver
* Get up to speed on how CiviCRM servers handle requests
* Try using Litmus and other paid Email Preview Services, to get a better understanding of how they work
* Look into the diverse markup/log-in steps for Gmail

26 May - 01 June: TIER 3/4- Week 1
* Logging in, sending, searching and viewing mails using Selenium Webdriver on Gmail
* Taking screenshots

02 June - 08 June: TIER 3/4- Week 2
* Start creating a working model, which when fed an HTML email code can return a screenshot from Gmail

09 June - 15 June: TIER 3/4- Week 3
* Finish creating the working model for TIER 3 and 4
* Perform tests to ensure it's working

16 June - 22 June: TIER 1/2- Week 1
* First Blog Post describing the details of the Midterm Submission and status of the project
* Start working on the model for TIER 1 and 2

23 June: Mid term submission - The working model of TIER 3/4

23 June - 29 June: TIER 1/2- Week 2
* Finish working on model for TIER 1 and 2
* Introduce improvements in the TIER 3-4 model, based on feedback

30 June - 06 July: Integration Week
* Integrate the two models to build the Alpha version
* Second Blog Post describing details of Alpha Version and status of the project

07 July - 13 July: Extend to other webmail clients

14 July - 27 July: Beta version Weeks 1 and 2
* Perform tests to ensure working of Beta Version before presenting
* Present Beta version based on feedback, incorporating other webmail clients
* Third blog post describing details of the Beta Version and status of the project
* Detailed documentation

27 July - 03 Aug: Testing and Bug fixing Week

04 Aug - 10 Aug:
* Buffer Week
* Research on building the GUI and incorporating the project with CiviCRM
* Final Documentation

11 - 17 Aug:
* Wrap up and Buffer Week
* Merging code with CiviCRM-master
* Final Blog Post describing takeaways, accomplishments and findings on what needs to be done before the project can go live on CiviCRM

18 Aug: Final submission

Potential Mentors (optional): totten (maybe bgm or ergonlogic would be interested?), kurund

Which aspect of the project idea do you see as the most difficult?
The biggest challenge would definitely be implementing the Selenium Webdriver part, as I am fairly new to NodeJS framework.

Which aspect of the project idea do you see as the easiest?
Once the Selenium Webdriver part is done and tested for bugs, extending the model to different webmail clients would become fairly easy.

Which portion of the project idea will you start with?
I'll start by learning to work with NodeJS and Selenium Webdriver and shall take ample time to, part by part, implement the Selenium Webdriver part of the project.
« Last Edit: March 26, 2015, 01:07:16 pm by utkarshsharma »

utkarshsharma

Re: Email Preview Cluster
May 14, 2015, 02:48:02 pm
Hi guys!
This is the first of many posts that I'm going to make regarding the project. This is a rather long one.
I read about AngularJS, NodeJS and Selenium Webdriver online, trying to get the feel of things.
I also had a chat with my mentor Tim and we addressed some doubts I had.
We decided I'm going to use JS to code webdriver and I started looking online as to how JS is used on Webdriver to perform tasks like webpage navigation, filling forms and taking a screenshot. While doing that I came across PhantomJS and CasperJS.

Here's what I've understood:
PhantomJS is a headless (no GUI) webkit browser. It's basically an automated browser. It can do all the things we want Webdriver to. BIG Con: No GUI. Pro: PhantomJS is fast, apparently. (Have a look at the attached image to see how PhantomJS takes a screenshot.)
Another interesting feature of PhantomJS is, it can also be implemented using Webdriver.
Here, I have an idea. We can initially perform tests using Webdriver on Chrome or Firefox and after we get them to work, we can use PhantomJS driven by Webdriver in our project. A problem can be: PhantomJS renders webpages differently than Chrome and Firefox.

CasperJS is a navigation scripting & testing utility (a JS library?) which works on top of PhantomJS and which simplifies PhantomJS scripts. This can also be used in our project. How? That I am going to explore.
Pro: Easier Coding. Con: Still under development. Con: All Phantom JS cons.

Webdriver: Pro: Can do almost everything which a person can do, and on a number of browsers. Con: Is slow. Con: Code becomes large.

Now, should I dig deeper here or should I stick to just Selenium Webdriver? Is PhantomJS/CasperJS really worth the effort?
These links might be helpful:
https://groups.google.com/forum/#!topic/phantomjs/jNDLklGNsIQ
http://stackoverflow.com/questions/14099770/casperjs-phantomjs-vs-selenium

I'm going to spend the next two or three days trying to figure out which of the three- PhantomJS (driven using Webdriver), CasperJS and Selenium Webdriver is best suited for use in our project.

I don't yet have a proper understanding of all these things, so some inputs would be really helpful.

Questions:
1. Can't a website detect that it's a bot doing the job and then ask us to fill a captcha or something and hinder our working?
See point number 5 listed under Disadvantages on http://scraping.pro/using-selenium-webdriver-for-website-scraping/
2. Can we use CiviMail Blast to send the emails to IDs on different webmail clients (which will later be logged into and screenshots will be taken)?
3. ^Which email IDs are we talking about here? Will we make some new ones? How are they going to be managed?

PS Thanks for giving me this opportunity to contribute. I'll try to make it count. :)

totten

Re: Email Preview Cluster
May 14, 2015, 07:42:27 pm
Quote from: utkarshsharma on May 14, 2015, 02:48:02 pm
I'm going to spend the next two or three days trying to figure out which of the three- PhantomJS (driven using Webdriver), CasperJS and Selenium Webdriver is best suited for use in our project.

I don't yet have a proper understanding of all these things, so some inputs would be really helpful.

In terms of runtime dependencies, it seems like we'd prefer to have PhantomJS as the main browser runtime, but there's a question of what interfaces to use to access it. One could use the PhantomJS-native APIs or the CasperJS APIs... which maybe has the benefit of being thinner, but it doesn't let you swap between browsers (eg the headless PhantomJS and the visible Chrome/Firefox/etc). I think it would make sense to use an API that allows you to swap browsers. Does http://webdriver.io/ fit the bill?
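
To show what "swapping browsers" could mean in practice, here is a small sketch that picks the browser from a setting instead of hard-coding it. The `PREVIEW_BROWSER` variable and the `browserOptions` helper are hypothetical. `browserName` is a standard WebDriver capability, but the exact option layout a given client library expects is an assumption here and should be checked against that library's documentation.

```javascript
// Browsers this sketch is willing to select (an illustrative list).
const SUPPORTED = ['phantomjs', 'chrome', 'firefox'];

// Build session options from an environment-like object.
// Defaults to the headless browser; a visible one can be chosen for debugging.
function browserOptions(env) {
  const name = (env.PREVIEW_BROWSER || 'phantomjs').toLowerCase();
  if (!SUPPORTED.includes(name)) {
    throw new Error('unsupported browser: ' + name);
  }
  return { capabilities: { browserName: name } };
}

console.log(browserOptions({}));
console.log(browserOptions({ PREVIEW_BROWSER: 'Chrome' }));
```

With the browser isolated in one setting, the same login and screenshot script could be developed against a visible browser and then run headless.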

Quote from: utkarshsharma on May 14, 2015, 02:48:02 pm
1. Can't a website detect that it's a bot doing the job and then ask us to fill a captcha or something and hinder our working?
See point number 5 listed under Disadvantages on http://scraping.pro/using-selenium-webdriver-for-website-scraping/

By default, it's probably true that the "User Agent" will be set in a way that's easy to detect. But most frameworks allow you to change the "User Agent" -- if it were an issue.

However, I'm not sure it would be an issue. As a webmail user, I don't see captchas on a day-to-day basis. During the initial signup (for a new gmail account), yes, there's probably a captcha -- and that makes a lot of sense. (One wouldn't want hundreds of fake throw-away accounts getting created.) However, I think a given installation only needs one account, and it can be setup manually.

Quote from: utkarshsharma on May 14, 2015, 02:48:02 pm
2. Can we use CiviMail Blast to send the emails to IDs on different webmail clients (which will later be logged into and screenshots will be taken)?
3. ^Which email IDs are we talking about here? Will we make some new ones? How are they going to be managed?

If an organization decides to run the email preview service, there would be an install/setup process. As part of that, perhaps they would edit a config file and put in their email ID and password. When it comes time to prepare a screenshot, the preview-service would:

1. Read the email ID and password from a config file.
2. Send an example email to that email address.
3. Login to the webmail and take a screenshot.
4. Post the screenshot to a web server.
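
A sketch of the first step, with the remaining steps listed as a plan. The config layout (`accounts` with `client`, `email`, `password`) and the helper names are assumptions for illustration; the text is inlined here so that the example runs without a file, whereas a real install would read it from disk.

```javascript
// Hypothetical config contents; a real install would load this from a file.
const configText = JSON.stringify({
  accounts: [{ client: 'gmail', email: 'preview@example.org', password: 'secret' }],
});

// Step 1: read and validate the email IDs and passwords.
function loadAccounts(text) {
  const config = JSON.parse(text);
  if (!Array.isArray(config.accounts) || config.accounts.length === 0) {
    throw new Error('config must list at least one account');
  }
  for (const account of config.accounts) {
    for (const key of ['client', 'email', 'password']) {
      if (typeof account[key] !== 'string' || account[key] === '') {
        throw new Error('account is missing "' + key + '"');
      }
    }
  }
  return config.accounts;
}

// Steps 2 to 4, described as a plan that never includes the password.
function planJob(account) {
  return [
    'send example email to ' + account.email,
    'log in to ' + account.client + ' webmail and take a screenshot',
    'post the screenshot to the web server',
  ];
}

const accounts = loadAccounts(configText);
const plan = planJob(accounts[0]);
console.log(plan);
```

Validating the config up front means a missing password fails at startup rather than halfway through a preview job.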

utkarshsharma

Re: Email Preview Cluster
May 14, 2015, 10:06:43 pm
Webdriver I/O looks quite like CasperJS, but it can be used for many browsers.
What apparently makes Selenium Webdriver and will also make Webdriver-I/O slow for Chrome and Firefox is the time taken to actually open a browser and the subsequent time taken by the browser to execute the steps (see points 2 to 4 in the Disadvantages section http://scraping.pro/using-selenium-webdriver-for-website-scraping/ ). I don't know if we can use Webdriver-I/O for PhantomJS.
If we can, Webdriver-I/O is perfect for our use.
For now we can do the entire thing on Webdriver-I/O for Chrome and then if we get time, we'll look at PhantomJS and pick what works best for us. How does that sound?
Also, I now feel these tools are quite similar to each other and we might not feel much difference using either. So, we can just pick one and move on. What do you think?

totten

Re: Email Preview Cluster
May 15, 2015, 02:18:13 pm
Quote from: utkarshsharma on May 14, 2015, 10:06:43 pm
Webdriver I/O looks quite like CasperJS, but it can be used for many browsers.
What apparently makes Selenium Webdriver slow, and will also make Webdriver-I/O slow for Chrome and Firefox, is the time taken to actually open a browser and the subsequent time taken by the browser to execute the steps (see points 2 to 4 in the Disadvantages section of http://scraping.pro/using-selenium-webdriver-for-website-scraping/ ).

Agree that WebDriver (or Selenium or PhantomJS or CasperJS or basically anything which involves a real rendering engine) incurs overhead. Other tools (like HtmlUnit) have lower overhead. But the overhead (launching a real rendering engine) is central to the goal (capturing realistic screenshots).

It may help to compare a few different use-cases:

  • Screencapture - The goal is to produce an image that accurately shows how a screen would render for users. This necessarily requires the overhead of a full rendering engine.
  • Data scraping or webcrawling - The goal is to extract information from a web-page. The visual appearance of the screen doesn't matter, so you can cut out the rendering-engine and use a lightweight browsing library (like HtmlUnit).
  • Integration testing (aka web testing or pageflow testing) - The goal is to ensure that a series of web-pages behave as expected. Here, you'll find a mix of conflicting opinions because there are some hard trade-offs between realism and performance. (Realism is desirable due to quirks in the different JS engines, and performance is desirable when there's a large quantity of tests.)

Quote from: utkarshsharma on May 14, 2015, 10:06:43 pm
I don't know if we can use Webdriver-I/O for PhantomJS.
If we can, Webdriver-I/O is perfect for our use.
For now we can do the entire thing on Webdriver-I/O for Chrome and then if we get time, we'll look at PhantomJS and pick what works best for us. How does that sound?
Also, I now feel these tools are quite similar to each other and we might not feel much difference using either. So, we can just pick one and move on. What do you think?

+1 -- it makes perfect sense to proceed with Chrome for now.

FWIW, Webdriver I/O does work with PhantomJS. Here's a variation on their guide (http://webdriver.io/guide.html):

Code:
## Make a project folder
mkdir webdriverio-test && cd webdriverio-test

## Download webdriverio and phantomjs
npm install webdriverio phantomjs

## Launch phantomjs
./node_modules/phantomjs/bin/phantomjs --webdriver=4444
# Note: leave this running and open another terminal

## Create "test.js" like in http://webdriver.io/guide.html
## but remove the "browserName: firefox" constraint.

## Run the test
node test.js

utkarshsharma

Re: Email Preview Cluster
May 19, 2015, 03:34:26 am
I have moved forward with using Webdriver I/O and Chrome.
I haven't yet succeeded in filling in forms, but I hope to sort that out in a day or two.

Also, I was trying to figure out what kind of request we are going to send to the server and how we're going to tell the server to execute the Webdriver script (a JS file).

This all may sound pretty naive.
One way can be:
When the user clicks on the 'Preview' button, we can run a script locally which will remotely log in (or connect) to the server and run system commands on the server (as we do locally on the Terminal) to execute the webdriver script present there.
Second way:
When the user clicks on the 'Preview' button we can send a POST request, with a unique ID for our request, appended to the URL. The server can hence identify our request and run the webdriver script to take screenshots and store them on the server itself.
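
To make the second way more concrete, here is a minimal sketch using only Node's built-in `http` module. The URL layout (`POST /preview/<jobId>`), the status codes, and the names `routeRequest` and `createPreviewServer` are all assumptions made up for illustration; nothing here is an agreed design.

```javascript
const http = require('http');

// Assumed URL layout: POST /preview/<jobId>, where <jobId> is the unique ID.
const JOB_URL = /^\/preview\/([A-Za-z0-9_-]+)$/;

// Decide how to answer a request. Kept free of I/O so it is easy to test.
// `jobs` is a Map from job ID to job state.
function routeRequest(method, url, jobs) {
  const match = JOB_URL.exec(url);
  if (!match) return { status: 404, body: { error: 'unknown path' } };
  if (method !== 'POST') return { status: 405, body: { error: 'use POST' } };

  const id = match[1];
  if (jobs.has(id)) return { status: 409, body: { error: 'duplicate job', id } };

  // A real service would also read the request body (the email to render)
  // and kick off the webdriver script here; this sketch only records the job.
  jobs.set(id, { state: 'queued' });
  return { status: 202, body: { id, state: 'queued' } };
}

// Wrap the routing function in an HTTP server (not started automatically).
function createPreviewServer(jobs) {
  return http.createServer((req, res) => {
    const { status, body } = routeRequest(req.method, req.url, jobs);
    res.writeHead(status, { 'Content-Type': 'application/json' });
    res.end(JSON.stringify(body));
  });
}
```

Something like `createPreviewServer(new Map()).listen(8080)` would start it; the client could then poll using the same ID to find out when the screenshots are ready.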

How and in what format are we going to retrieve those pictures now? I guess this is the thing you and Kurund are going to help me with.
I'll be spending the next two-three days trying to figure out all this stuff.

totten

Re: Email Preview Cluster
May 19, 2015, 03:56:40 pm
Cool, looking forward to it.

Regarding the format for how to submit/retrieve jobs/screenshots, I spent last Friday evaluating some design questions and just posted some notes on the wiki:

http://wiki.civicrm.org/confluence/display/CRM/Email+Preview+Infrastructure

Generally, an approach based on POSTing to a REST API (which is based on NodeJS) seems the most appealing for the current version. In the next 24-48hr, I'll aim to post some specs for the REST API on the wiki.

I'm still debating on which mix of libraries/frameworks to try first. Some leading contenders:

  • Restify - http://mcavage.me/node-restify/ - HTTP library oriented toward implementing RESTful APIs
  • Sequelize - http://sequelizejs.com/ - ORM framework / DB abstraction
  • Loopback.io - http://loopback.io/ - Full framework with support for scaffolding new REST APIs with many different data stores


This forum was archived on 2017-11-26.