File downloading and Selenium grid

By Grzegorz Szczutkowski | July 10, 2019 | Selenium Grid, Selenium WebDriver

Downloading files from a web application is a common use case, and the feature is usually an important one. If both are true for your application, you should consider adding automated tests for the scenarios that cover file downloading. When we test locally (without Selenium Grid), the solution is simple. With Selenium Grid in the test environment, it takes more effort to make it work. In this article I describe how to solve the problem in the Java world, using FluentLenium as the Selenium wrapper.

The solution is to install an Apache server on every machine that hosts a Selenium Grid Node. The browser saves downloaded files in the Apache www root folder. When a test runs, it asks the Selenium Grid Hub for the IP address of the Selenium Grid Node it is using; the node is looked up by the WebDriver's session id. When a new file appears on the Apache server, it is copied to the machine the tests are running from. Once the file is on that machine, we can do with it whatever we want!
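
The article does not show how the browser is told to save files into the Apache www root. As a minimal sketch for Chrome only, the download folder can be set through browser preferences. The class name DownloadPrefsSketch and the path /var/www/html are assumptions for illustration; with Selenium, a map like this would typically be handed to ChromeOptions via setExperimentalOption("prefs", prefs).

```java
import java.util.HashMap;
import java.util.Map;

class DownloadPrefsSketch {

    // Builds Chrome preferences that send downloads to the given folder
    // and suppress the "Save as" dialog.
    static Map<String, Object> chromePrefs(String downloadDir) {
        Map<String, Object> prefs = new HashMap<>();
        prefs.put("download.default_directory", downloadDir);
        prefs.put("download.prompt_for_download", false);
        return prefs;
    }

    public static void main(String[] args) {
        // /var/www/html is an assumed Apache document root, not taken from the article.
        System.out.println(chromePrefs("/var/www/html"));
    }
}
```

Firefox uses different preference names, so a real setup would need a separate branch for it.
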
The first class we need represents the Apache web page that lists the downloaded files. It has only one method, which returns the URLs of all files with the given extension located in the Apache root folder. Note that the Apache URL we navigate to must not serve a default page file (such as index.html), so that Apache shows the directory listing instead.

public class ApacheFilesListPage extends FluentPage {

    public List<URL> filesWithExtension(String extension) {
        await().until($("h1")).displayed();
        return $("a[href$='" + extension + "']").stream()
                .map(el -> {
                    try {
                        return new URL(el.attribute("href"));
                    } catch (MalformedURLException e) {
                        throw new RuntimeException("It is not possible to create URL based on "
                            + el.attribute("href"));
                    }
                })
                .collect(Collectors.toList());
    }
}
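
To make the selector a[href$='...'] easier to picture, here is a minimal sketch of the same filtering done on plain strings, without a browser. The class name HrefFilterSketch and the sample links are made up for illustration; the sketch returns URI objects rather than URL to avoid the checked exception.

```java
import java.net.URI;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class HrefFilterSketch {

    // Keeps only the links whose href ends with the given extension,
    // which is the condition the CSS selector a[href$='...'] expresses.
    static List<URI> filesWithExtension(List<String> hrefs, String extension) {
        return hrefs.stream()
                .filter(href -> href.endsWith(extension))
                .map(URI::create)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Sample links as they might appear in an Apache directory listing.
        List<String> hrefs = Arrays.asList(
                "http://localhost/report.pdf",
                "http://localhost/data.csv",
                "http://localhost/?C=N;O=D");
        System.out.println(filesWithExtension(hrefs, ".pdf"));
    }
}
```

The sorting links that Apache puts in the listing header are dropped by the suffix match, so only real files remain.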

The next thing we need is a method for navigating to that page. In my solution I add it to an interface that is implemented by every class from which navigating to the downloads page makes sense. As you can see, I navigate to that page only when the browser is CHROME or FIREFOX, which means I am using the remote versions of those browsers, that is, Selenium Grid. When browserType() has a different value, I am running the tests locally and the downloads page is not required. The methods goTo and newInstance are implemented in the FluentLenium framework. HTTP_ADDRESS_FOR_DOWNLOADS_PAGE is a constant that contains the URL of the Apache web page as seen from the browser. Because the server runs on the same machine as the browser, it will usually be http://localhost.

public interface ViewNavigator {
    void goTo(String url);

    <T> T newInstance(Class<T> cls);

    default Optional<ApacheFilesListPage> goToDownloadsPage() {
        if (Config.config().browserType().equals(BrowserType.CHROME)
                || Config.config().browserType().equals(BrowserType.FIREFOX)) {
            goTo(HTTP_ADDRESS_FOR_DOWNLOADS_PAGE);
            LOGGER.info("Navigate to local download page ({}).", HTTP_ADDRESS_FOR_DOWNLOADS_PAGE);
            return Optional.of(newInstance(ApacheFilesListPage.class));
        } else {
            return Optional.empty();
        }
    }
}
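
The decision made in goToDownloadsPage() can be sketched without any framework. The enum values and names below are hypothetical stand-ins for the project's own Config and BrowserType, which the article does not show; the point is only how a caller can consume the returned Optional.

```java
import java.util.Optional;

class DownloadsPageSketch {

    // Hypothetical browser types: the first two stand for remote (grid) browsers.
    enum Browser { CHROME, FIREFOX, CHROME_LOCAL, FIREFOX_LOCAL }

    static final String DOWNLOADS_PAGE = "http://localhost/";

    // Returns the downloads page address for grid runs, empty for local runs.
    static Optional<String> downloadsPageUrl(Browser browser) {
        if (browser == Browser.CHROME || browser == Browser.FIREFOX) {
            return Optional.of(DOWNLOADS_PAGE);
        }
        return Optional.empty();
    }

    // Shows how a caller can branch on the Optional without null checks.
    static String describe(Browser browser) {
        return downloadsPageUrl(browser)
                .map(url -> "grid run, files listed at " + url)
                .orElse("local run, files read from disk");
    }

    public static void main(String[] args) {
        System.out.println(describe(Browser.CHROME));
        System.out.println(describe(Browser.CHROME_LOCAL));
    }
}
```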

The next important class gets the IP address of the Selenium Grid Node currently used by the test. It opens an HTTP connection to the Selenium Grid Hub and passes the current Selenium WebDriver session id as a request parameter. The response comes back in JSON format. The proxyId attribute contains the address of the Selenium Grid Node that serves the given session.

public final class GridNodeInfoExtractor {
    private static final String SELENIUM_GRID_HUB_URL = "http://selenum-grid-hub-ip/wd/hub";

    private GridNodeInfoExtractor() {
    }

    public static URL currentGridNodeLocation(RemoteWebDriver remoteWebDriver) {
        try {
            URL urlWithPrivateIp = currentGridNodeLocation(remoteWebDriver.getSessionId());
            return new URL(urlWithPrivateIp.getProtocol(), urlWithPrivateIp.getHost(),
                    urlWithPrivateIp.getPort(), urlWithPrivateIp.getFile());
        } catch (MalformedURLException e) {
            // Keep the original exception as the cause so the stack trace is not lost.
            throw new RuntimeException("Cannot create a correct Selenium Grid URL.", e);
        } catch (IOException e) {
            throw new RuntimeException("IOException has been thrown while getting data from Selenium Hub.", e);
        }
    }

    public static URL currentGridNodeLocation(SessionId session) throws IOException {
        String hubLocation = SELENIUM_GRID_HUB_URL.replace("/wd/hub", "");
        URL url = new URL(hubLocation + "/grid/api/testsession?session=" + session);
        // The hub URL uses plain http, so the connection is an HttpURLConnection;
        // casting it to HttpsURLConnection would fail with a ClassCastException.
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        StringBuilder response = new StringBuilder();
        // try-with-resources closes the reader even when reading fails.
        try (BufferedReader br = new BufferedReader(new InputStreamReader(conn.getInputStream()))) {
            String line;
            while ((line = br.readLine()) != null) {
                response.append(line);
            }
        }
        JSONObject jsonResponse = new JSONObject(response.toString());
        // Check for the attribute first: getString throws when the key is missing.
        if (jsonResponse.has("proxyId")) {
            return new URL(jsonResponse.getString("proxyId"));
        } else {
            throw new IllegalStateException("It is not possible to extract instance private ip from hub server.");
        }
    }
}
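
The lookup can be tried out without a running grid. The sketch below builds the hub endpoint, pulls the node host out of a sample response, and then points a URL that the node's browser sees as localhost at the node itself. The class name, the sample JSON and the addresses are assumptions for illustration, and a regular expression stands in for the JSON library to keep the example dependency-free.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GridNodeLookupSketch {

    private static final Pattern PROXY_ID = Pattern.compile("\"proxyId\"\\s*:\\s*\"([^\"]+)\"");

    // Builds the hub endpoint that is queried for a given session id.
    static String testSessionEndpoint(String hubUrl, String sessionId) {
        return hubUrl.replace("/wd/hub", "") + "/grid/api/testsession?session=" + sessionId;
    }

    // Extracts the node host from the proxyId attribute of the hub response.
    static String nodeHost(String json) {
        Matcher matcher = PROXY_ID.matcher(json);
        if (!matcher.find()) {
            throw new IllegalStateException("No proxyId in hub response.");
        }
        return URI.create(matcher.group(1)).getHost();
    }

    // Rewrites a URL seen by the node's browser so that it targets the node's host and port.
    static URI onNode(URI localUrl, String nodeHost, int port) {
        try {
            return new URI(localUrl.getScheme(), null, nodeHost, port, localUrl.getPath(), null, null);
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException("Cannot rewrite " + localUrl, e);
        }
    }

    public static void main(String[] args) {
        // Assumed shape of the hub response; only proxyId matters here.
        String json = "{\"success\":true,\"proxyId\":\"http://10.0.0.5:5555\"}";
        System.out.println(testSessionEndpoint("http://hub.example/wd/hub", "abc123"));
        System.out.println(onNode(URI.create("http://localhost/report.pdf"), nodeHost(json), 80));
    }
}
```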

The goal of the last required class is to fetch the newly downloaded file while the test case runs. Its constructor takes two arguments. The first is the optional Apache downloads page: when it is empty, we assume the test runs locally and the downloaded files are on the local drive; when an ApacheFilesListPage object is passed, we assume the test runs via Selenium Grid and the file should be available on the Apache page. The second argument is the extension of the file we expect to get during the test.
While the object is being created, it records the initial list of available files with the given extension. Calling getNewDownloadedFile() opens the downloads page in a new tab and returns a File object for the newly downloaded file, copied to the local drive.
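
The core idea is a before/after comparison: take a snapshot of the file list, perform the download, and look for entries that were not in the snapshot. A minimal sketch of that comparison, with plain strings instead of URLs and a hypothetical class name, could look like this. It also applies the rule used by the class below that exactly one new file is expected.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class NewFileDiffSketch {

    // Returns the entries present now that were not in the initial snapshot.
    static List<String> newFiles(List<String> initial, List<String> current) {
        List<String> added = new ArrayList<>(current);
        added.removeAll(initial);
        return added;
    }

    // Returns the single new entry, or fails when there are none or several.
    static String singleNewFile(List<String> initial, List<String> current) {
        List<String> added = newFiles(initial, current);
        if (added.size() != 1) {
            throw new IllegalStateException("Expected exactly one new file but found " + added.size());
        }
        return added.get(0);
    }

    public static void main(String[] args) {
        List<String> before = Arrays.asList("old.pdf");
        List<String> after = Arrays.asList("old.pdf", "invoice.pdf");
        System.out.println(singleNewFile(before, after));
    }
}
```

Because the comparison relies on the snapshot, the object has to be created before the action that triggers the download.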

public class FileExtractor {
    private static final Logger LOGGER = LoggerFactory.getLogger(FileExtractor.class);
    private static final String HTTP_ADDRESS_FOR_DOWNLOADS_PAGE = "http://localhost/";
    private static final int FILE_DOWNLOADED_POOLING_IN_MILLISECONDS = 200;
    private static final int HTTP_PORT = 80;
    private final List<URL> initialFilesList;
    private final String fileExtension;
    private final Optional<ApacheFilesListPage> downloadPage;

    public FileExtractor(Optional<ApacheFilesListPage> from, String fileExtension) {
        this.downloadPage = from;
        this.fileExtension = fileExtension;
        this.initialFilesList = getListOfCurrentFiles();
    }

    public File getNewDownloadedFile() throws IOException {
        List<File> filesList = new ArrayList<>();
        if (downloadPage.isPresent()) {
            String dashboardWindowHandle = downloadPage.get().getDriver().getWindowHandle();
            downloadPage.get().window().openNewAndSwitch();

            waitForNewFilesToBeDownloaded();
            for (URL url : getListOfOnlineFiles(downloadPage.get())) {
                if (!initialFilesList.contains(url)) {
                    String fileName = url.getFile()
                            .replaceAll("\\\\", "")
                            .replaceAll("/", "");
                    File tempFile = new File(fileName);
                    String externalHost = currentGridNodeLocation((RemoteWebDriver) downloadPage.get().getDriver())
                            .getHost();
                    URL externalUrl = new URL(url.getProtocol(), externalHost, HTTP_PORT, url.getFile());
                    try {
                        FileUtils.copyURLToFile(externalUrl, tempFile);
                    } catch (IOException e) {
                        LOGGER.info(e.getMessage());
                        externalUrl = new URL(url.getProtocol(), ec2PrivateIp2PublicIp(externalHost), HTTP_PORT, url.getFile());
                        FileUtils.copyURLToFile(externalUrl, tempFile);
                    }
                    filesList.add(tempFile);
                    LOGGER.info("New file available on server ({}) and it has been downloaded.", externalUrl);
                }
            }
            downloadPage.get().window().switchTo(dashboardWindowHandle);
        } else {
            waitForNewFilesToBeDownloaded();
            for (URL url : getListOfLocalFiles()) {
                if (!initialFilesList.contains(url)) {
                    filesList.add(new File(url.getFile()));
                    LOGGER.info("New file available locally: {}", url);
                }
            }
        }
        if (filesList.size() == 1) {
            return filesList.get(0);
        } else {
            throw new IllegalStateException("Only one new file should be downloaded.");
        }
    }

    private List<URL> getListOfCurrentFiles() {
        if (downloadPage.isPresent()) {
            return getListOfOnlineFiles(downloadPage.get());
        } else {
            return getListOfLocalFiles();
        }
    }

    private List<URL> getListOfOnlineFiles(ApacheFilesListPage page) {
        page.goTo(HTTP_ADDRESS_FOR_DOWNLOADS_PAGE);
        return page.filesWithExtension(fileExtension);
    }

    private List<URL> getListOfLocalFiles() {
        return Stream.of(Objects.requireNonNull(new File(LOCAL_RUN_DOWNLOAD_LOCATION).list()))
                .filter(file -> file.endsWith(fileExtension))
                .map(FileExtractor::createURL)
                .collect(Collectors.toList());
    }

    private void waitForNewFilesToBeDownloaded() {
        final long timeout = new Date().getTime() + ACTION_TIMEOUT_IN_SECONDS_FOR_FILE_DOWNLOADING * MILLISECONDS_IN_SECOND;
        while (timeout > new Date().getTime()) {
            if (initialFilesList.size() < getListOfCurrentFiles().size()) {
                return;
            }
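            try {
                Thread.sleep(FILE_DOWNLOADED_POOLING_IN_MILLISECONDS);
            } catch (InterruptedException e) {
                // Restore the interrupt flag and stop waiting; sleeping again would throw immediately.
                Thread.currentThread().interrupt();
                throw new IllegalStateException("Interrupted while waiting for the downloaded file.", e);
            }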
        }
        throw new TimeoutException("Could not find new downloaded file after "
                + ACTION_TIMEOUT_IN_SECONDS_FOR_FILE_DOWNLOADING
                + " seconds.");
    }

    private static URL createURL(String fileName) {
        try {
            return new File(fileName).toURI().toURL();
        } catch (MalformedURLException e) {
            throw new RuntimeException("File name " + fileName + " cannot be converted to a URL", e);
        }
    }
}
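
The waiting loop in waitForNewFilesToBeDownloaded() is an instance of polling with a deadline. A minimal generic sketch of that pattern is shown here; the class and method names are hypothetical, and unlike the method above it reports a timeout by returning false instead of throwing.

```java
import java.time.Duration;
import java.util.function.BooleanSupplier;

class PollingSketch {

    // Re-checks the condition every pollInterval until it holds or the timeout passes.
    // Returns true when the condition held, false on timeout or interruption.
    static boolean waitUntil(BooleanSupplier condition, Duration timeout, Duration pollInterval) {
        long deadline = System.nanoTime() + timeout.toNanos();
        while (System.nanoTime() - deadline < 0) {
            if (condition.getAsBoolean()) {
                return true;
            }
            try {
                Thread.sleep(pollInterval.toMillis());
            } catch (InterruptedException e) {
                // Restore the interrupt flag and give up waiting.
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        int[] checks = {0};
        // The condition becomes true on the third check, standing in for "a new file appeared".
        boolean found = waitUntil(() -> ++checks[0] >= 3, Duration.ofSeconds(2), Duration.ofMillis(10));
        System.out.println(found + " after " + checks[0] + " checks");
    }
}
```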

Enjoy!
