For someone who wants to get into the big data industry, the most pressing question is what to learn. This article shares an overview of a big data learning curriculum.
The big data technology stack is complex. Its foundations cover data acquisition, data preprocessing, data mining, NoSQL databases, and distributed storage, and it extends to multimodal computing, parallel data-warehouse computing, machine learning, deep learning, artificial intelligence, and visualization, among other technical categories and levels. The technologies used in different fields differ considerably, so mastering big data technology and theory across multiple fields in a short period is very difficult. It is recommended to start from the requirements of one practical application area, tackle one technical point at a time, and extend outward once you have a solid foundation; the learning results will be much better.
First, the JavaSE Foundation Core
Java introductory syntax, the object-oriented core, collections and generics, process-control structures, the exception system, the threading mechanism, reflection, IO streams, network programming, design patterns, and the new features of JDK 8/9/10. Projects: a customer management system, an examination management system, and a banking management system.
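A minimal sketch of how several of these Java SE topics fit together, assuming a hypothetical `Repo` class (not code from the course projects): a generic class backed by a collection, with synchronized methods exercised from multiple threads.

```java
import java.util.ArrayList;
import java.util.List;

// A hypothetical generic in-memory repository: combines generics,
// collections, and the threading mechanism in one small example.
public class Repo<T> {
    private final List<T> items = new ArrayList<>();

    // synchronized keeps concurrent adds from corrupting the list
    public synchronized void add(T item) { items.add(item); }
    public synchronized int size() { return items.size(); }

    public static void main(String[] args) throws InterruptedException {
        Repo<String> customers = new Repo<>();
        // Two threads each add 100 entries concurrently.
        Runnable task = () -> { for (int i = 0; i < 100; i++) customers.add("c" + i); };
        Thread t1 = new Thread(task), t2 = new Thread(task);
        t1.start(); t2.start();
        t1.join(); t2.join();
        System.out.println(customers.size()); // prints 200
    }
}
```

The same repository shape, backed by a real database instead of a list, is what the management-system projects typically practice.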
Second, Database Key Technology
MySQL installation and use, SQL language fundamentals, DDL, DML, DCL, stored procedures and functions, triggers, indexing and optimization, JDBC core technology, a custom BaseDAO, DBUtils, and database connection pooling.
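The "custom BaseDAO" idea can be previewed without a database: one generic base class provides CRUD so that per-table DAOs stay small. This is an in-memory sketch only; a real BaseDAO would wrap JDBC (`PreparedStatement` plus a connection pool), and the map here merely stands in for a table.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

// In-memory sketch of the generic BaseDAO pattern; the LinkedHashMap
// is a stand-in for a database table, not a real persistence layer.
public class BaseDao<K, V> {
    private final Map<K, V> table = new LinkedHashMap<>();

    public void save(K id, V row)     { table.put(id, row); }
    public Optional<V> findById(K id) { return Optional.ofNullable(table.get(id)); }
    public boolean deleteById(K id)   { return table.remove(id) != null; }
    public int count()                { return table.size(); }

    public static void main(String[] args) {
        // A concrete "CustomerDao" is just BaseDao with types filled in.
        BaseDao<Integer, String> customerDao = new BaseDao<>();
        customerDao.save(1, "Alice");
        customerDao.save(2, "Bob");
        System.out.println(customerDao.findById(1).orElse("none")); // prints Alice
        customerDao.deleteById(2);
        System.out.println(customerDao.count()); // prints 1
    }
}
```

The design point is the same one JDBC-based DAOs make: shared CRUD logic lives once in the base class.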
Third, the Core of Big Data Foundation
Linux, shell scripting, Maven, advanced MySQL, Hadoop, ZooKeeper with high availability (HA), Hive, Flume, Kafka, HBase, and a data acquisition platform project.
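The core idea behind Hadoop (and behind Hive queries compiled to it) is MapReduce, and it can be previewed in plain Java before touching the framework. This single-process word count mirrors the map and reduce phases; a real Hadoop job runs the same two steps distributed across a cluster.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Plain-Java preview of the MapReduce word-count idea: map each line to
// (word, 1) pairs, then reduce by summing counts per word. Single-process
// sketch only — Hadoop distributes these phases across many machines.
public class WordCount {
    public static Map<String, Integer> count(String... lines) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String line : lines) {                        // "map" phase: emit words
            for (String word : line.toLowerCase().split("\\s+")) {
                if (word.isEmpty()) continue;
                counts.merge(word, 1, Integer::sum);       // "reduce" phase: sum per key
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        Map<String, Integer> c = count("hadoop hive kafka", "hive flume hive");
        System.out.println(c.get("hive")); // prints 3
    }
}
```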
Fourth, the Spark Ecosystem Framework & Elective Big Data Projects
The Scala language, Spark Core, Spark SQL, Spark internals, Sqoop, Kylin, Druid, Presto, metadata management, enterprise-level integration projects, and an offline data warehouse project.
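Spark Core's programming model — lazy transformations that only run when an action is called — has a close analogue in plain Java Streams, which can make the concept concrete before installing Spark. This is illustrative Java only, not Spark API code.

```java
import java.util.List;
import java.util.stream.Collectors;

// Analogy for Spark's RDD pipelines using Java Streams: filter/map are
// lazy intermediate operations (like rdd.filter/rdd.map), and nothing
// executes until a terminal operation runs (like collect()). Spark adds
// distribution and fault tolerance on top of this shape.
public class LazyPipeline {
    public static List<Integer> doubledEvens(List<Integer> input) {
        return input.stream()
                .filter(n -> n % 2 == 0)          // transformation: lazy
                .map(n -> n * 2)                  // transformation: lazy
                .collect(Collectors.toList());    // action: triggers evaluation
    }

    public static void main(String[] args) {
        System.out.println(doubledEvens(List.of(1, 2, 3, 4))); // prints [4, 8]
    }
}
```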
Fifth, the Spark Ecosystem Framework & Enterprise Integration Projects
Spark Streaming, the Redis cache database, Git & GitHub, JVM optimization, Spark optimization, Elasticsearch, Kibana, Scala algorithms and data structures, an online education project in practice, and enterprise-level integration projects such as a real-time analysis project.
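In real-time projects, Redis is typically used in a cache-aside pattern: check the cache first, go to the slow store on a miss, then populate the cache. The sketch below uses a `HashMap` as a stand-in for Redis, and the loader function is a hypothetical placeholder, not a real API.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Cache-aside pattern sketch. The HashMap stands in for Redis; the
// Function<String,String> stands in for a (hypothetical) database lookup.
public class CacheAside {
    private final Map<String, String> cache = new HashMap<>(); // Redis stand-in
    private int dbHits = 0; // counts how often the slow path actually runs

    public String get(String key, Function<String, String> loadFromDb) {
        String cached = cache.get(key);
        if (cached != null) return cached;      // cache hit: skip the database
        dbHits++;
        String value = loadFromDb.apply(key);   // cache miss: load from the store
        cache.put(key, value);                  // populate for subsequent reads
        return value;
    }

    public int dbHits() { return dbHits; }

    public static void main(String[] args) {
        CacheAside c = new CacheAside();
        c.get("user:1", k -> "Alice");
        c.get("user:1", k -> "Alice");
        System.out.println(c.dbHits()); // prints 1 — second read came from cache
    }
}
```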
Sixth, Flink Streaming Data Processing Framework
The Flink environment, Flink DataStream, Flink DataSet, Flink windows, Flink watermarks, Flink state and checkpoints, and Flink on YARN. Enterprise-level practical projects: a real-time analysis project, a risk control project, and a CDH data warehouse project.
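The core arithmetic behind Flink's tumbling windows is easy to preview in plain Java: each timestamped event is assigned to a fixed-size bucket by integer division on its timestamp. Real Flink adds parallelism, state backends, and watermarks for late data; this sketch shows only the window assignment.

```java
import java.util.Map;
import java.util.TreeMap;

// Plain-Java sketch of tumbling-window aggregation: events are
// (timestampMillis, value) pairs, each assigned to the window starting at
// (timestamp / windowSize) * windowSize, with values summed per window.
public class TumblingWindow {
    public static Map<Long, Long> sumPerWindow(long windowSizeMillis, long[][] events) {
        Map<Long, Long> sums = new TreeMap<>(); // keyed by window start time
        for (long[] e : events) {
            long windowStart = (e[0] / windowSizeMillis) * windowSizeMillis;
            sums.merge(windowStart, e[1], Long::sum);
        }
        return sums;
    }

    public static void main(String[] args) {
        long[][] events = {{1000, 1}, {4000, 2}, {6000, 3}};
        // 5-second windows: [0,5000) sums to 3, [5000,10000) sums to 3
        System.out.println(sumPerWindow(5000, events)); // prints {0=3, 5000=3}
    }
}
```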
Whichever direction of data work you pursue, both junior and senior roles require the ability to learn quickly: business logic and industry knowledge on one hand, technical tools and analytical frameworks on the other. There is a great deal to learn in the field of data analysis, so above all you need a genuine willingness to keep learning.