This paper presents the phenomenon of big data, paying special attention to its relation to research work in the experimental sciences. I seek answers to two questions. First, can the research methods proposed within the big data paradigm be applied in the experimental sciences? Second, does applying research methods subject to the big data paradigm lead, in consequence, to a new understanding of science?
A common observation of everyday life reveals the growing importance of data science methods, which form an increasingly important part of the mainstream knowledge generation process. Digital technologies and their potential for data collection and processing have initiated the birth of the fourth paradigm of science, based on Big Data. Key to these transformations are datafication and data mining, which allow the discovery of knowledge from contaminated data. The main purpose of the considerations presented here is to describe the phenomena that make up these processes and to indicate their possible epistemological consequences. It has been assumed that increasing datafication tendencies may result in the formation of a data-centric perception of all aspects of reality, making data and the methods of their processing a kind of higher instance shaping human thinking about the world. This research is theoretical in nature. Issues such as the process of datafication and data science have been analyzed, with a focus on the areas that raise doubts about the validity of this form of cognition.
Power big data contain a large amount of information related to equipment faults, and analyzing and processing these data enables fault diagnosis. This study mainly analyzed the application of association rules in power big data processing. First, association rules and the Apriori algorithm were introduced. Then, to address the shortcomings of the Apriori algorithm, an IM-Apriori algorithm was designed, and a simulation experiment was carried out. The results showed that the IM-Apriori algorithm had a significant advantage over the Apriori algorithm in running time: when the number of transactions was 100,000, the IM-Apriori algorithm ran 38.42% faster than the Apriori algorithm. The IM-Apriori algorithm was also little affected by the value of the minimum support (support_min). Compared with the Extreme Learning Machine (ELM), the IM-Apriori algorithm had better accuracy. The experimental results show the effectiveness of the IM-Apriori algorithm in fault diagnosis, and it can be further promoted and applied to power grid equipment.
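For context on the association-rule mining the abstract builds on, below is a minimal sketch of the classic Apriori algorithm, the baseline that the IM-Apriori variant improves on; the paper's specific improvement is not described here, so only the standard level-wise search is shown.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Classic Apriori: level-wise search for frequent itemsets.

    transactions: list of sets of items; min_support: absolute count threshold.
    Returns a dict mapping frozenset(itemset) -> support count.
    """
    # Count candidate 1-itemsets.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    frequent = {s: c for s, c in counts.items() if c >= min_support}
    result = dict(frequent)
    k = 2
    while frequent:
        # Join step: build candidate k-itemsets from items in frequent (k-1)-itemsets.
        items = sorted({i for s in frequent for i in s})
        candidates = [frozenset(c) for c in combinations(items, k)]
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        candidates = [c for c in candidates
                      if all(frozenset(sub) in frequent
                             for sub in combinations(c, k - 1))]
        # Count support of surviving candidates with a full database scan.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        frequent = {s: c for s, c in counts.items() if c >= min_support}
        result.update(frequent)
        k += 1
    return result
```

The repeated database scans in the counting step are the main cost that Apriori refinements typically target.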
The road network development programme, as well as the planning and design of transport systems for cities and agglomerations, requires complex analyses and traffic forecasts. This particularly applies to higher-class roads (motorways and expressways), which in urban areas support different types of traffic. There is usually a conflict between the needs of long-distance traffic, in whose interest higher-class roads run through undeveloped areas, and the need to bring such roads closer to potential destinations, i.e. cities [1]. Recognising the importance of this problem, it is necessary to develop the research and methodology of traffic analysis, especially trip models. Current experience shows that agglomeration models are usually simplified in comparison with large-city models, which results either from a misunderstanding of the significance of these movements for the functioning of the entire model or from a lack of input data. The article presents the results of the INMOP 3 research project, within which an attempt was made to increase the accuracy of traffic generation in the agglomeration model by using Big Data: the mobile operator's data on SIM card movements in the Warsaw agglomeration.
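The core step in turning SIM card movements into trip-model input is aggregating individual records into an origin-destination (OD) matrix. A minimal sketch follows; the column names and zone labels are hypothetical, not taken from the INMOP 3 project:

```python
import pandas as pd

# Hypothetical SIM-card movement records: one row per observed trip,
# with origin and destination zones inferred from cell-tower handovers.
trips = pd.DataFrame({
    "sim_id":      [1, 1, 2, 3, 3],
    "origin_zone": ["A", "B", "A", "C", "A"],
    "dest_zone":   ["B", "A", "B", "A", "B"],
})

# Aggregate individual movements into an origin-destination matrix,
# a basic input of trip generation and distribution models.
od_matrix = (trips.groupby(["origin_zone", "dest_zone"])
                  .size()
                  .unstack(fill_value=0))
print(od_matrix)
```

In practice such counts would still need expansion factors to scale the operator's SIM sample up to total population traffic.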
In the face of the information technology revolution, social science researchers face a considerable challenge. With the growing popularity of the Internet, enormous amounts of data have appeared containing the opinions, views, and interests of its users. Although analyzing these data poses serious methodological problems for researchers, the case for using them rests on the fascinating material that arises without any intervention by researchers. A large part of this material consists of data from the world's most popular search engine, Google. Every minute, its users from all over the world submit more than 3 million queries, which are then classified and made available through continuously updated tools. This article discusses attempts to adapt these data to the needs of the social sciences, as well as existing research on the subject. Practical aspects of working with Google's tools, Google Trends and Google Keyword Planner, are also discussed. The article is intended primarily for social science researchers interested in Internet sources of Big Data and in using these data in scholarly work.
With the increasing demand for customisation and high-quality products, it has become necessary for industries to digitise their processes. With the introduction of computers and Internet of Things (IoT) devices, processes are evolving and real-time monitoring has become easier. Better monitoring of processes produces more accurate results and identifies losses more precisely, which in turn helps increase productivity. This introduction of computers, and the interaction between machines and computers, constitutes the latest industrial revolution, known as Industry 4.0, in which an organisation has total control over the entire value chain of a product's life cycle. For now it remains an idea, but an achievable one, in which IoT, big data, smart manufacturing and cloud-based manufacturing play important roles. The difference between the third and fourth industrial revolutions is that Industry 4.0 also integrates humans into the manufacturing process. The paper discusses the different ways to implement the concept and the tools to be used to do so.
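As one concrete illustration of what "identifying losses through better monitoring" can mean in practice, a minimal sketch of flagging abnormal sensor readings by z-score follows; the threshold and the sample data are assumptions, and a production Industry 4.0 system would of course stream such readings from IoT devices rather than hold them in a list:

```python
from statistics import mean, stdev

def flag_anomalies(readings, z_threshold=2.0):
    """Return indices of readings whose z-score exceeds z_threshold.

    A minimal stand-in for the real-time process monitoring the text
    describes: deviations flagged here would point to potential losses.
    """
    mu, sigma = mean(readings), stdev(readings)
    return [i for i, r in enumerate(readings)
            if sigma > 0 and abs(r - mu) / sigma > z_threshold]
```

Real deployments would use rolling statistics or control charts rather than a single global mean, but the idea of turning raw monitoring data into actionable flags is the same.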
The application of churn prevention represents an important step for mobile communication
companies aiming to increase customer loyalty. From a machine learning perspective,
Customer Value Management departments require automated methods and processes to
create marketing campaigns able to identify the most appropriate churn prevention approach.
Moving towards a big data-driven environment, a deeper understanding of data
provided by churn processes and client operations is needed. In this context, a procedure
aiming at reducing the number of churners by planning a customized marketing campaign
is deployed through a data-driven approach. Decision Tree methodology is applied to draw
up a list of clients with churn propensity: in this way, customer analysis is detailed, as is
the development of the marketing campaign, integrating the individual churn model with a
viral churn perspective. The first step of the proposed procedure requires the evaluation of
churn probability for each customer, based on the influence of their social links. Then,
customer profiling is performed considering (a) individual variables, (b) variables describing
customer-company interactions, (c) external variables. The main contribution of this work
is the development of a versatile procedure for viral churn prevention, applying Decision
Tree techniques in the telecommunication sector, and integrating a direct campaign from
the Customer Value Management marketing department to each customer with significant
churn risk. A case study of a mobile communication company is also presented to explain
the proposed procedure, as well as to analyze its real performance and results.
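The decision-tree scoring step the abstract describes can be sketched as follows. The feature names mirror the three variable groups in the text, individual, customer-company interaction, and external variables, but the specific features, data, and probability cut-off are illustrative assumptions, not the company's actual model:

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

# Hypothetical feature matrix, one row per customer:
# (a) individual: age; (b) customer-company interaction: complaints
# in the last year; (c) external: churners among the customer's contacts.
X = np.array([
    [25, 0, 0],
    [34, 1, 0],
    [45, 4, 2],
    [29, 0, 1],
    [52, 5, 3],
    [41, 3, 2],
])
y = np.array([0, 0, 1, 0, 1, 1])  # 1 = customer churned

clf = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, y)

# Score each customer's churn propensity and draw up the contact list
# for the campaign: everyone above a chosen probability cut-off.
churn_prob = clf.predict_proba(X)[:, 1]
contact_list = [i for i, p in enumerate(churn_prob) if p >= 0.5]
```

A shallow tree is a reasonable choice here because the resulting splits stay interpretable for the marketing department, which must justify who gets contacted and why.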