Developing big data applications can easily become a very complex task. Configuring the big data platform, choosing the right technologies and creating an organized code repository consume a great deal of time. Even when the environment is set up properly, building new applications by hand requires high technical expertise and considerable development effort.
To solve these problems, Integer8 provides an easy-to-use integration platform for creating Apache Spark applications visually. With its smart development interface, even complex applications can be created with minimal development effort and expertise. This lets our users focus on business value instead of technical difficulties.
Besides, Integer8 comes with enterprise features such as code versioning, work-space isolation, variables, schemas and so on. Re-using component blocks helps our users protect their coding standards and keep their environment organized.
Integer8 can be used by citizen data integration developers, data analysts, data scientists and big data developers to extract value from data and to analyse it. The Catalog module provides an interactive SQL query environment for non-technical users who would like to investigate their Hadoop data.
Integer8 supports use cases including core ETL development, data warehousing, DWH offloading, stream processing, data discovery and exploration, advanced analytics and data publishing using dashboards.
Integer8 uses the proven Apache Spark execution engine to process fast, high-volume data on scalable Hadoop platforms.
Integer8 supports Apache Spark 2.1.0 and newer versions.
Yes, each project created in the work-space is isolated in terms of all underlying objects.
Yes, Integer8 leverages YARN for allocating cluster resources. Users can adjust, limit or maximize Integer8 application resources easily.
Integer8 supports running custom Scala code on data pipelines via the Custom Code Component.
Integer8 comes with a simple scheduler for basic purposes. Third-party scheduler tools are also supported. Integer8 exposes a REST API endpoint to control job executions.
Yes, it is possible to set advanced Spark configuration parameters globally for better performance. It is also possible to control the degree of parallelism at the job level.
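As a rough illustration of what global and job-level settings amount to, the sketch below merges two sets of Spark properties and renders them as `spark-submit` arguments for YARN. The property names are standard Spark settings. The function name, the example values, the `my_job.jar` path and the rule that job-level values override global ones are assumptions made for illustration, not a description of how Integer8 applies these settings internally.

```python
def build_submit_args(global_conf, job_conf, app_path):
    """Render merged Spark properties as a spark-submit argument list."""
    # Assumption: a job-level value overrides the global value for the same key.
    merged = {**global_conf, **job_conf}
    args = ["spark-submit", "--master", "yarn"]
    for key in sorted(merged):
        args += ["--conf", f"{key}={merged[key]}"]
    args.append(app_path)
    return args


# Hypothetical global defaults for the whole installation.
global_conf = {
    "spark.executor.memory": "4g",
    "spark.executor.cores": "2",
    "spark.sql.shuffle.partitions": "200",
}
# Hypothetical job-level override that lowers shuffle parallelism for one job.
job_conf = {"spark.sql.shuffle.partitions": "64"}

submit_args = build_submit_args(global_conf, job_conf, "my_job.jar")
print(" ".join(submit_args))
```

Keeping the defaults in one place and overriding only what a single job needs mirrors the global versus job-level split described above.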
Yes, Integer8 provides variables at the GLOBAL, PROJECT and JOB levels. Every parameter of a component (such as file paths, table names, number of partitions, etc.) can be standardized through variables.
Besides, Integer8 supports metadata (schema) and connection info (link) sharing between jobs. Each reusable item can be created and managed from one place.
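To make the three variable levels concrete, here is a minimal sketch of scoped variable resolution. It assumes that a narrower scope overrides a broader one (JOB over PROJECT over GLOBAL) and uses Python's `string.Template` `${name}` placeholders. Both the precedence rule and the placeholder syntax are assumptions for illustration and may differ from Integer8's actual behaviour.

```python
from string import Template

# Broadest scope first; later scopes override earlier ones (assumed precedence).
SCOPE_ORDER = ("GLOBAL", "PROJECT", "JOB")


def resolve_variables(scopes):
    """Flatten scoped variables into one mapping, narrowest scope winning."""
    resolved = {}
    for level in SCOPE_ORDER:
        resolved.update(scopes.get(level, {}))
    return resolved


def render_parameter(value, scopes):
    """Substitute ${name} placeholders in a component parameter."""
    return Template(value).substitute(resolve_variables(scopes))


# Hypothetical variable definitions at each level.
scopes = {
    "GLOBAL": {"base_path": "/data", "partitions": "8"},
    "PROJECT": {"base_path": "/data/sales"},
    "JOB": {"partitions": "32"},
}

file_path = render_parameter("${base_path}/orders.csv", scopes)
num_partitions = int(render_parameter("${partitions}", scopes))
print(file_path, num_partitions)
```

With these definitions the project-level `base_path` and the job-level `partitions` take effect, while any variable not redefined falls back to its global value.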
Integer8 supports a wide variety of data source connectors: databases such as Oracle, Microsoft and PostgreSQL; the MongoDB NoSQL database; stream sources such as Kafka and Azure Event Hubs; and filesystems such as AWS S3 and Azure Blob Storage.
We are developing new connectors for upcoming releases. If a connector for your use case doesn’t exist, please contact us. We can prioritize your request if possible.
Sure, Integer8 supports reading and writing streams through file-systems as Apache Spark does. You can configure your local HDFS or remote file-systems like AWS S3 or Azure Blob Storage for this purpose.
Integer8 uses Apache Spark JDBC connectivity to access remote databases. The number of partitions and the number of parallel connections can be customized easily via table parameters.
Integer8 comes with a Twitter Connector for listening to tweets from the Twitter Public API. You can easily create a Twitter API access key and use it within this connector.
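For background, Spark's JDBC source parallelizes a read by splitting a numeric column range into per-partition `WHERE` clauses, driven by the `partitionColumn`, `lowerBound`, `upperBound` and `numPartitions` options. The sketch below is a simplified approximation of that splitting for non-negative integer bounds. How Integer8's table parameters map onto these options is an assumption, and the function name is hypothetical.

```python
def partition_predicates(column, lower_bound, upper_bound, num_partitions):
    """Approximate the WHERE clauses of a partitioned Spark JDBC read."""
    if num_partitions < 1:
        raise ValueError("num_partitions must be at least 1")
    if lower_bound >= upper_bound:
        raise ValueError("lower_bound must be smaller than upper_bound")
    if num_partitions == 1:
        return [None]  # a single, unfiltered read

    # Width of each partition's slice of the column range.
    stride = upper_bound // num_partitions - lower_bound // num_partitions
    predicates = []
    current = lower_bound
    for i in range(num_partitions):
        lower_clause = f"{column} >= {current}" if i > 0 else None
        current += stride
        upper_clause = f"{column} < {current}" if i < num_partitions - 1 else None
        if lower_clause is None:
            # First partition is open-ended below and also picks up NULLs.
            predicates.append(f"{upper_clause} OR {column} IS NULL")
        elif upper_clause is None:
            # Last partition is open-ended above.
            predicates.append(lower_clause)
        else:
            predicates.append(f"{lower_clause} AND {upper_clause}")
    return predicates


predicates = partition_predicates("id", 0, 100, 4)
for clause in predicates:
    print(clause)
```

Each clause corresponds to one parallel query against the database, so the partition count also bounds the number of concurrent connections. The bounds only shape the slices: rows outside the range still land in the first or last partition.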
– Deployment Options
Integer8 can be deployed to on-premise Hadoop clusters as well as cloud platforms such as Microsoft Azure and Amazon Web Services. All services for running Integer8 are deployed on an edge node in the Hadoop cluster, and all application resources are managed with YARN. As a result, Integer8 doesn’t affect existing applications running on your Hadoop platform.
Integer8 depends on several core Hadoop components such as HDFS, YARN and SPARK2. Contact us and together we can figure out whether your existing system is suitable for an Integer8 installation.
Yes, you can easily install Integer8 on your Azure HDInsight Spark cluster as an add-on application.
Yes, you can use Integer8 with AWS EC2 instances.