Day 01
Data Integration scenarios overview
An introduction to Qlik Talend Cloud and its key usage scenarios
ONLINE WEBINAR
Watch the Recorded Session on Best Practices for Data Integration Scenarios
Customer case
How Domino's Pizza is Mastering Data
Getting a single view of customers and global operations from 85,000 data sources
Domino’s Pizza ranks among the world’s top restaurant brands. Domino’s AnyWare is the company’s name for their customers’ ability to order pizzas not just online and through mobile phone apps, but via smartwatches, TVs, car entertainment systems, and even social media platforms. Domino's wanted to integrate information from every channel — 85,000 structured and unstructured data sources in all — to get a single view of its customers and global operations.
Unfortunately, the IT architecture in place at Domino’s was preventing them from reaching those goals. Says Dan Djuric, Vice President, Global Infrastructure and Enterprise Information Management: “We had more than 11,000 business users, 35 data scientists, and marketing agencies — and all of them wanted to build their own database.”

According to Djuric, key reasons Domino’s selected Talend were greater freedom to scale with a more agile architecture, open-source flexibility, rapid implementation, cost-effective and understandable licensing, and predictability.
Talend is now our data mobilization platform. Everything that happens in our ecosystem starts with Talend. It captures data, cleanses it, standardizes it, enriches it, stores it, and allows it to be consumed by multiple teams.
— Dan Djuric,
Vice President, Global Infrastructure & Enterprise Information Management, Domino's
Using Talend in conjunction with another off-the-shelf MDM software solution, Domino’s identifies unique customers from millions of order transactions. They have an infrastructure that collects information from all the company’s point-of-sale systems and 26 supply chain centers, and through all its channels, including text messages, Twitter, Pebble, Android, and Amazon Echo. Data is fed into Domino’s Enterprise Management Framework, where it’s combined with enrichment data from a large number of third-party sources, such as the United States Postal Service, as well as geocode, demographic, and competitive information.
  • 17TB of data: streamlined and standardized for BI and advanced analytics
  • 85k+ data sources: integrated, both structured and unstructured
  • 1 data tracker: centralized data collection from the company’s point-of-sale systems, 26 supply chain centers, and its channels
Theory
Seven data integration and quality scenarios
There are lots of ways to use data in business, and this list shows seven everyday examples
Data integration is the process of combining data from different sources to create a complete, accurate, and up-to-date dataset for BI, data analysis, and other applications and business processes. It includes data replication, ingestion, and transformation to combine different types of data into a standard format to be stored in a target repository such as a data warehouse, data lake or data lakehouse.
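As a minimal illustration of that extract-transform-load flow (a hedged sketch only: the file name, columns, and target table are invented, and real pipelines would use dedicated tooling rather than hand-rolled scripts):

```python
# Minimal extract -> transform -> load pipeline: read a CSV,
# standardize the records, and load them into a SQLite "warehouse".
# File name, columns, and table are invented for illustration.
import csv
import sqlite3

def extract(path: str) -> list[dict]:
    with open(path, newline="") as f:
        return list(csv.DictReader(f))

def transform(rows: list[dict]) -> list[tuple]:
    # Combine different inputs into a standard format:
    # trimmed names, uppercase country codes.
    return [(r["id"], r["name"].strip(), r["country"].upper()) for r in rows]

def load(rows: list[tuple], db_path: str) -> None:
    conn = sqlite3.connect(db_path)
    conn.execute("CREATE TABLE IF NOT EXISTS customers (id, name, country)")
    conn.executemany("INSERT INTO customers VALUES (?, ?, ?)", rows)
    conn.commit()

load(transform(extract("customers.csv")), "warehouse.db")
```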

There are many ways to apply data integration and quality, but below are seven everyday scenarios that most companies share. The list is not ordered by importance, market size, or strategic focus, and the use cases are not exclusive: it is common for an organization to run several at once. Let's look at seven ways to use Qlik and Talend.

01 Database-to-database synchronization

Database-to-database synchronization is the mainstay use case for many of us at Qlik and Talend. The combined functionality offers you tremendous flexibility for whatever problem you're trying to solve. So whether you use basic data loading, real-time replication, or micro-batch updates, we've got you covered. In fact, database-to-database replication is most commonly used to solve the following issues (a minimal sketch follows the list):
1. Real-time data for reporting and analytics: Replicating data to a separate database or warehouse can allow for faster and more efficient querying and analysis of the data, without impacting the performance of the primary database.
2. Real-time data integration: Replicating data between databases can facilitate data integration between different systems or applications across the organization to ensure that data is consistent and up to date.
3. Legacy modernization: Offloading legacy data to new data stores to reduce OLAP costs and improve query performance.
4. Cloud data movement: Replicating data between on-premises data sources and cloud databases for new ML initiatives.
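To make the micro-batch flavor of this scenario concrete, here is a minimal sketch of watermark-based incremental replication between two SQLite databases. The orders table, its columns, and the updated_at watermark are illustrative assumptions, not Qlik or Talend functionality; production replication typically reads the database transaction log instead of polling a timestamp column.

```python
# Watermark-based incremental sync between two SQLite databases.
# Table, columns, and the updated_at watermark are illustrative
# assumptions; ISO-8601 timestamps compare correctly as text.
import sqlite3

def sync_orders(source_path: str, target_path: str) -> int:
    """Copy new and changed rows from source.orders to target.orders."""
    src = sqlite3.connect(source_path)
    tgt = sqlite3.connect(target_path)
    tgt.execute("""CREATE TABLE IF NOT EXISTS orders (
                       id INTEGER PRIMARY KEY,
                       customer TEXT,
                       amount REAL,
                       updated_at TEXT)""")
    # High-water mark: the newest change already replicated.
    (watermark,) = tgt.execute(
        "SELECT COALESCE(MAX(updated_at), '') FROM orders").fetchone()
    rows = src.execute(
        "SELECT id, customer, amount, updated_at FROM orders "
        "WHERE updated_at > ? ORDER BY updated_at", (watermark,)).fetchall()
    # Upsert so updated rows are overwritten rather than duplicated.
    tgt.executemany(
        "INSERT INTO orders VALUES (?, ?, ?, ?) "
        "ON CONFLICT(id) DO UPDATE SET customer=excluded.customer, "
        "amount=excluded.amount, updated_at=excluded.updated_at", rows)
    tgt.commit()
    return len(rows)
```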

02 Data warehouse modernization

The second use case focuses on data warehouse modernization, which refers to automating a cloud data warehouse’s design, deployment, and operation. Data warehouse automation offers faster time to market for new data warehouses, improves data quality, and reduces the costs associated with manual administration.

Qlik’s secret sauce is the intelligent data pipelines that help organizations scale their data warehousing efforts more efficiently by automatically generating the necessary transformation SQL and pushing it down to the warehouse for execution.
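To illustrate the generate-and-push-down idea (a toy sketch, not Qlik's actual engine; the spec format, tables, and columns are invented, and SQLite stands in for the warehouse):

```python
# Toy "pushdown" generator: render a declarative spec into SQL and
# execute it inside the target warehouse (SQLite as a stand-in).
# The spec format and all names are invented for illustration.
import sqlite3

SPEC = {
    "target": "dim_customer",
    "source": "raw_customers",
    "columns": {
        "customer_id": "id",
        "full_name": "TRIM(first_name) || ' ' || TRIM(last_name)",
        "country": "UPPER(country_code)",
    },
}

def render_sql(spec: dict) -> str:
    cols = ",\n       ".join(
        f"{expr} AS {name}" for name, expr in spec["columns"].items())
    return (f"CREATE TABLE {spec['target']} AS\n"
            f"SELECT {cols}\n"
            f"FROM {spec['source']}")

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE raw_customers (id, first_name, last_name, country_code)")
conn.execute("INSERT INTO raw_customers VALUES (1, ' Ada ', 'Lovelace', 'gb')")
conn.executescript(render_sql(SPEC))  # push the generated SQL down to the target
print(conn.execute("SELECT * FROM dim_customer").fetchall())
```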

03 Data lake/lakehouse automation

No segment of the data integration market has seen as much change in recent years as the data lake. Consequently, there are many approaches to data lake implementation, and once again, Qlik's combined portfolio can support any architecture. Our data lake automation solutions help you move enterprise data, transform it, and enforce data governance policies to help you build a data lake for your data analytics, machine learning, and AI initiatives, regardless of whether your lake is based on Hadoop, cloud object stores, or Databricks.
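A common first step in lake automation is landing raw files in cloud object storage. The sketch below assumes an AWS account with configured credentials and an existing bucket (the bucket, prefix, and file layout are placeholders); it uses the open-source boto3 client rather than Qlik's tooling:

```python
# Land local data files into an S3-based data lake "raw" zone.
# Bucket name, prefix, and file layout are placeholder assumptions;
# requires AWS credentials configured for boto3 (pip install boto3).
from pathlib import Path
import boto3

s3 = boto3.client("s3")
BUCKET = "example-data-lake"   # placeholder bucket
PREFIX = "raw/orders/"         # landing zone for this dataset

for csv_file in Path("exports").glob("*.csv"):
    key = PREFIX + csv_file.name
    s3.upload_file(str(csv_file), BUCKET, key)
    print(f"landed s3://{BUCKET}/{key}")
```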

04 Database-to-streams/streams-to-database

(or other destinations)

Integrating databases with streaming infrastructures like Apache Kafka or Amazon Kinesis can help organizations gain insights from their dynamic data and respond more quickly to changing business conditions. For example, a company might use a database to store customer data, such as their name, address, and purchase history. They could then use a streaming infrastructure like Kafka to process purchase data as it is generated in real time to highlight nefarious behavior such as fraudulent credit card transactions. Qlik's data integration and quality solutions can synchronize database transactions with streams and also source data from streams to route to any destination in virtually any format.
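A rough sketch of the database-to-stream direction, using the open-source kafka-python client rather than Qlik's replication engine; the broker address, topic name, and event shape are assumptions:

```python
# Publish purchase events from an operational database to Kafka.
# Broker address, topic name, and event shape are assumptions;
# uses the open-source kafka-python client (pip install kafka-python).
import json
from kafka import KafkaProducer

producer = KafkaProducer(
    bootstrap_servers="localhost:9092",
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)

purchase = {"customer_id": 42, "amount": 129.99, "card_last4": "0000"}
# Key by customer so all of a customer's events land in one partition,
# keeping them ordered for downstream fraud detection.
producer.send("purchases",
              key=str(purchase["customer_id"]).encode(),
              value=purchase)
producer.flush()
```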

05 Data quality and governance

Accurate data is the lifeblood of any successful initiative that drives organizational excellence. Consequently, data quality is crucial to any business process, especially the following (a small validation sketch appears after this list):
Data analysis: Good-quality data is essential for accurate data analysis.
Customer relationship management: Accurate data helps businesses to better understand their customers and provide superior customer service.
Risk management: High-quality data helps businesses identify risks and take appropriate actions to mitigate them.
Marketing: Correct data helps businesses target their marketing efforts more effectively.
Financial reporting: Precise data helps businesses produce accurate financial reports.
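To ground the idea, here is a minimal rule-based validation sketch in plain Python. The record shape and rules are invented for illustration; real deployments would use dedicated data quality tooling such as Talend Data Quality:

```python
# Minimal rule-based data quality checks on customer records.
# The record shape and rules are illustrative, not a Talend API.
import re

EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")

def validate(record: dict) -> list[str]:
    """Return a list of quality issues found in one record."""
    issues = []
    if not record.get("customer_id"):
        issues.append("missing customer_id")
    if not EMAIL_RE.match(record.get("email", "")):
        issues.append(f"bad email: {record.get('email')!r}")
    if not (0 <= record.get("age", -1) <= 120):
        issues.append(f"age out of range: {record.get('age')}")
    return issues

records = [
    {"customer_id": 1, "email": "ada@example.com", "age": 36},
    {"customer_id": None, "email": "not-an-email", "age": 208},
]
for rec in records:
    for issue in validate(rec):
        print(f"record {rec.get('customer_id')}: {issue}")
```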

06 API services and workflow

APIs are an intermediary layer that helps companies safely expose their application data and functionality to external third-party developers, business partners, and other company departments to encourage collaboration and drive innovation. Qlik and Talend's portfolio enables you to create and consume your own APIs for scenarios such as the following (a toy service sketch follows the list):
Driving collaboration: Create organizational APIs as part of a cloud-first strategy.
Delivering innovation: Build new applications that leverage existing data and functionality via APIs.
Controlling access: Publish APIs that control data exchange between multiple parties.
Adopting new architectures: Create "data contracts" as part of a data mesh.
Enabling automation: Automate business processes such as order processing, inventory management, and customer support.
Improving efficiency: Integrate different systems such as CRM, ERP, and e-commerce platforms.
Implementing reverse ETL: Write back KPIs from data warehouse to operational systems.
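As a toy illustration of exposing data through an API (plain Flask here, not Talend's API tooling; the endpoint and data are invented):

```python
# Toy read-only API over a small in-memory dataset, using Flask
# (pip install flask). Endpoint path and data are invented; Talend's
# API tooling adds contracts, testing, and governance on top.
from flask import Flask, jsonify, abort

app = Flask(__name__)

CUSTOMERS = {
    1: {"id": 1, "name": "Ada Lovelace", "country": "GB"},
    2: {"id": 2, "name": "Grace Hopper", "country": "US"},
}

@app.get("/api/customers/<int:customer_id>")
def get_customer(customer_id: int):
    customer = CUSTOMERS.get(customer_id)
    if customer is None:
        abort(404)  # expose only the contract, never internals
    return jsonify(customer)

if __name__ == "__main__":
    app.run(port=8080)
```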

07 Operational data transformation

The last category in this list is operational data transformation. This converts raw data into formats that can be used by downstream processes such as electronic data exchange, data science, or analytics. Typically, operational data transformation occurs outside the data warehouse or lake, with the final files saved in an object store. Examples include converting transactional records into HL7 files, transforming CSV files to Parquet, and converting aggregate data sources into EDI-consumable formats. Qlik's data integration and quality solutions contain specialized functionality for many common transformations and will help you rapidly solve the data exchange problem for specific industry formats.
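One of the examples above, CSV to Parquet, is easy to sketch with the open-source pyarrow library (file names are placeholders; Qlik's solutions handle such conversions, plus industry formats like HL7 and EDI, at scale):

```python
# Convert a CSV file to Parquet using pyarrow (pip install pyarrow).
# File paths are placeholders for whatever lands in your object store.
import pyarrow.csv as pv
import pyarrow.parquet as pq

table = pv.read_csv("orders.csv")         # infers column types
pq.write_table(table, "orders.parquet",
               compression="snappy")      # columnar, compressed output
print(table.schema)
```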
Additional theory
Five approaches to data integration
What data integration is and how to manage the process
There are five different approaches to execute data integration: ETL, ELT, streaming, application integration (API) and data virtualization. To implement these processes, data engineers, architects and developers can either manually code an architecture using SQL or, more often, they set up and manage a data integration tool, which streamlines development and automates the system.
Each of these five approaches continues to evolve with the surrounding ecosystem of the modern data stack. Historically, data warehouses were the main target repositories, so data had to be transformed before loading. This is the classic ETL data pipeline (Extract > Transform > Load), and it’s still appropriate for small datasets that need complex transformations. However, with the rise of Integration Platform as a Service (iPaaS) solutions, larger datasets, data fabric and data mesh architectures, and the need to support real-time analytics and machine learning projects, integration is shifting from ETL to ELT, streaming, and API.
An ETL pipeline is a traditional type of data pipeline which converts raw data to match the target system via three steps: extract, transform and load. Data is transformed in a staging area before it is loaded into the target repository (typically a data warehouse). This allows for fast and accurate data analysis in the target system and is most appropriate for small datasets which require complex transformations.

Change data capture (CDC) is a method of ETL and refers to the process or technology for identifying and capturing changes made to a database. These changes can then be applied to another data repository or made available in a format consumable by ETL, EAI, or other types of data integration tools.
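A hedged sketch of the apply side of CDC: given a stream of change events already captured from a source (the event shape is invented; real CDC tools read these changes from the database transaction log), replay them in commit order against a target:

```python
# Apply a captured stream of change events to a target table.
# The event format is invented for illustration; real CDC tools
# read these changes from the source database's transaction log.
target: dict[int, dict] = {}   # target "table", keyed by primary key

changes = [
    {"op": "insert", "id": 1, "row": {"name": "Ada", "city": "London"}},
    {"op": "update", "id": 1, "row": {"name": "Ada", "city": "Paris"}},
    {"op": "delete", "id": 1, "row": None},
]

for event in changes:          # apply in commit order
    if event["op"] == "insert":
        target[event["id"]] = event["row"]
    elif event["op"] == "update":
        target[event["id"]].update(event["row"])
    elif event["op"] == "delete":
        target.pop(event["id"], None)

print(target)                  # {} after insert -> update -> delete
```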
HANDS-ON WITH TALEND
Get access to Talend Cloud
Sign up for a Talend Cloud free trial account
Talend Cloud Services are online services that let you access Talend software features and functions to design, manage, and monitor data integration capabilities. The platform comprises multiple components, including Talend Studio, Talend Management Console, Talend Cloud Data Preparation, and many more.

See the short introductory video about Talend Cloud capabilities (4.5 minutes):
For now, all you need is access to Talend Cloud, where we will explore platform capabilities with small daily hands-on practice tasks during Data Integration Week.
To start your Talend Cloud trial, follow these steps:
1. Visit the talend.com website.
2. Click the "Free trial" button on the first screen.
3. In the first drop-down menu, choose the "Europe on AWS" region.
4. Fill in the registration form, entering your real email address, and submit your details.
5. Within a few minutes you will receive an email with a link to activate the trial. Check your Spam folder if the email does not arrive.
6. Follow the link in the email and complete your registration by finalizing the details in Talend Cloud.
After the initial setup and a short introductory video, Talend will ask you to choose one of the ways you want to interact with the product: as an Analyst, as a Developer, or as a Self-Explorer.

Since our programme will give you tasks to complete, choose the third option and launch Talend Cloud! You will be able to come back to these materials if you need them via the upper menu.
Great! Now you have access to Talend Cloud.
HANDS-ON WITH TALEND
Implementing no-code ETL in Talend Cloud
Watch this short tutorial video (6:14) and repeat the steps with your own dataset:
DEEP DIVE
Related Links
Learn how organizations can achieve a 3-year ROI of 355% and payback in less than six months.
What to look for in a Data Integration solution.
Learn how you can bring value to your business by bringing together all types of data.
Staying in touch with Qlik Data Integration and Analytics
Subscribe to the biweekly LinkedIn newsletter "Data Matters" with news, events, and cases from the CEE market.