Posts in 'All categories' (Page 8)

All categories +1144

python + elasticsearch : csv => bulk json conversion
2019.10.23

elasticsearch connect code with SSL applied + python
2019.10.22

logstash_01 / json
2019.10.19

Naver news article crawling => loading into elasticsearch
2019.07.12

Jeju Island trip, day 1
2019.06.13

elasticsearch + java _api + match_all
2019.06.04

naver music crawling + elasticsearch
2019.05.22

elasticsearch : count
2019.05.21

Naver Q&A solution
2019.05.18

elasticsearch + shell01
2019.05.11

python + elasticsearch : csv => bulk json conversion

ELK/elasticsearch | 2019. 10. 23. 10:04


    ## Convert csv => a bulk-format file
    ## (needs "import csv", "import json" and "import time" at module level)
    def csvFileRead(self):

        with open(self._csv_path, "r", encoding="utf-8") as csv_file:

            with open("/home/elastic/test.dir/py_test/the_planet_code/the_planet.json", "a", encoding="utf-8") as f:

                csv_reader = csv.DictReader(csv_file, delimiter=",")

                for c, r in enumerate(csv_reader):

                    # action line: target index and document id (the_planet_1, the_planet_2, ...)
                    h = json.dumps({"index": {"_index": "the_planet_poc_index_1", "_type": "_doc", "_id": "the_planet_" + str(c+1)}},
                                  ensure_ascii=False)
                    f.write(h + "\n")

                    # source line: the csv row as a JSON document
                    e = json.dumps((dict(r)),
                                   ensure_ascii=False)
                    f.write(e + "\n")
                    time.sleep(2)

        # both files are closed automatically when the with blocks exit
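
The conversion above can also be tried without touching the file system. The following is only a minimal sketch: the function name `csv_to_bulk_lines` and the sample columns are made up for illustration (the real CSV columns are not shown in this post), and `_type` is left out on the assumption of a cluster where mapping types are deprecated (7.x) or removed (8.x).

```python
import csv
import io
import json


def csv_to_bulk_lines(csv_text, index_name, id_prefix):
    """Yield alternating action and source lines for the _bulk API."""
    reader = csv.DictReader(io.StringIO(csv_text), delimiter=",")
    for row_number, row in enumerate(reader, start=1):
        # Action line: which index and which document id to write
        action = {"index": {"_index": index_name, "_id": id_prefix + str(row_number)}}
        yield json.dumps(action, ensure_ascii=False)
        # Source line: the CSV row itself (all values stay strings)
        yield json.dumps(dict(row), ensure_ascii=False)


# Hypothetical sample data, used only to exercise the function
sample_csv = "name,radius_km\nEarth,6371\nMars,3389\n"

# A bulk body is newline-delimited JSON and must end with a newline
bulk_body = "\n".join(
    csv_to_bulk_lines(sample_csv, "the_planet_poc_index_1", "the_planet_")
) + "\n"
print(bulk_body)
```

Each CSV row becomes two lines, so two rows give four lines in the bulk body.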


elasticsearch connect code with SSL applied + python

ELK/elasticsearch | 2019. 10. 22. 20:39


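    @classmethod
    def getElaNode(cls):
        # needs "import sys", "import yaml" and
        # "from elasticsearch import Elasticsearch" at module level

        try:

            f = open("/home/elastic/test.dir/py_test/conf.yml", "r", encoding="utf-8")
        except FileNotFoundError as E:
            print (E)
            sys.exit(1)
        else:
            # read the connection settings, then close the file
            with f:
                INF = yaml.safe_load(f)

            retNode = Elasticsearch(
                    host      = INF.get("ela_host"),
                    port      = 9200,
                    http_auth = (INF.get("user"),
                                 INF.get("psswd"))
                    )

            if retNode.ping():
                #print (retNode.info())
                return retNode

            # ping failed: no usable node
            return None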
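
The snippet above passes only host, port and credentials. As a rough sketch of where the TLS options would go, assuming the 7.x `elasticsearch` Python client (which accepts `scheme`, `use_ssl`, `verify_certs` and `ca_certs`), the keyword arguments could be assembled as below. The helper name `build_client_kwargs` and the CA path are hypothetical, and the sample values are placeholders rather than the real contents of conf.yml.

```python
def build_client_kwargs(conf, ca_certs_path):
    """Assemble keyword arguments for a TLS connection (7.x client assumed)."""
    return {
        "hosts": [{"host": conf["ela_host"], "port": 9200}],
        "http_auth": (conf["user"], conf["psswd"]),
        "scheme": "https",          # talk HTTPS instead of HTTP
        "use_ssl": True,
        "verify_certs": True,       # reject certificates that do not validate
        "ca_certs": ca_certs_path,  # CA bundle that signed the node certificate
    }


# Placeholder settings with the same keys the post reads from conf.yml
sample_conf = {"ela_host": "localhost", "user": "elastic", "psswd": "changeme"}
client_kwargs = build_client_kwargs(sample_conf, "/path/to/ca.crt")
print(client_kwargs["scheme"], client_kwargs["hosts"])

# With the client installed this would be used as:
#   from elasticsearch import Elasticsearch
#   node = Elasticsearch(**client_kwargs)
```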


logstash_01 / json

ELK/elasticsearch | 2019. 10. 19. 13:36


input {
  stdin {}
}

filter {
  json {
    source => "message"
  }
  mutate {
    remove_field => ["path", "@version", "message", "host", "@timestamp"]
  }
}

output {
  stdout {
    codec => rubydebug { }
  }
  elasticsearch {
    hosts => "192.168.240.183:9200"
    index => "kim"
    document_type => "_doc"
  }
}
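
To make the filter section easier to follow, here is a small Python sketch of what it does to one event in the success case: the `json` filter parses the `message` string and puts its keys at the top level of the event, then `mutate` drops the listed fields. The function name `apply_filter` and the sample event are invented for illustration, and parse failures are not modelled.

```python
import json

# Same field list as remove_field in the mutate block
REMOVED_FIELDS = ["path", "@version", "message", "host", "@timestamp"]


def apply_filter(event):
    """Mimic json { source => "message" } followed by mutate { remove_field }."""
    result = dict(event)
    # json filter: expand the parsed message into the root of the event
    result.update(json.loads(event["message"]))
    # mutate filter: drop the bookkeeping fields
    for field in REMOVED_FIELDS:
        result.pop(field, None)
    return result


# Hypothetical event as it might arrive from stdin
sample_event = {
    "message": '{"title": "sample", "rank": 1}',
    "host": "localhost",
    "@version": "1",
    "@timestamp": "2019-10-19T04:36:00.000Z",
}
print(apply_filter(sample_event))
```

Only the keys that came from the JSON line are left, so that is what gets indexed into `kim`.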


Naver news article crawling => loading into elasticsearch

Language/python | 2019. 7. 12. 16:25


##
# date
##
from elasticsearch import Elasticsearch
import requests as req
import json
import yaml

from bs4 import BeautifulSoup

from time import localtime
import time
import os
import sys

# ========================================


class NaverNews:

    def __init__(self):

        self.elastiClient = NaverNews.elaInformation()

        urlSettingObj = NaverNews.urlRequestSetting()

        ## URL info
        self.reqUrl = urlSettingObj.get("url")

        ## e.g. format => 20190712
        self.currTimeObj = NaverNews.getCurrTime()

        self.urlInfo = {
            "etcParams": urlSettingObj.get("etcParam"),
            "page": None
        }

    def isTrue(self):
        """ Does the title contain one of the words I am looking for?
        """

        ## ====================================================================
        # directory holding the saved html files
        ## ====================================================================
        htmlPath = r"C:\Users\ezfarm\PycharmProjects\ElasticSearchProj\htmlObj"

        for htmlF in os.listdir(htmlPath):
            absolutePath = os.path.join(htmlPath, htmlF)
            print (absolutePath)

            ### =============================
            # html file read
            ### =============================
            try:
                with open(absolutePath) as htmlFileRead:
                    bsObject = BeautifulSoup(htmlFileRead, "html.parser")
            except FileNotFoundError as e:
                print (e)
                continue

            HEAD_LINE = bsObject.select("ul.type06_headline > li")

            for h in HEAD_LINE:
                # use the second <dt> when there is one, otherwise the first
                dtList = h.select("dl > dt")
                if not dtList:
                    print ("request error")
                    continue
                headline = dtList[1] if len(dtList) > 1 else dtList[0]
                if headline.a is None:
                    continue

                responseObj = self.textPreprocessing(headline.a.string)

                if responseObj["isTrue"]:
                    lede = h.select_one("dl > dd > span.lede")
                    self.elasticInsertDocuments(responseObj["title"],
                                                lede.string if lede else None)


    def textPreprocessing(self, txt):
        tmp = str(txt).strip().replace("\n", "")
        mark = {"title": tmp, "isTrue": False}

        for i in ["김정은", "이명박", "미사일"]:

            if i in tmp:
                mark["isTrue"] = True
                break

        return mark

    def elasticInsertDocuments(self, title, hObject):

        documents = {
            "title"   : title,
            "context" : hObject,
            "cllctdt" : self.currTimeObj
        }

        try:

            self.elastiClient.index (
                index    ="naver_headline_index",   # index to load into
                doc_type ="doc",
                body     =documents
            )
        except Exception as e:
            print ("load failed !!!", e)
        else:
            time.sleep(1.2)
            print("elasticsearch insert success !!!")
            print (documents)

    def doRequests(self):

        for p in range(1, 95):
            self.urlInfo["page"] = str(p)
            # query string example:
            #   mode=LSD&mid=sec&sid1=100&date=20190712&page=7
            paramsEtc = self.urlInfo["etcParams"]  + "&" + \
                        "date=" + self.currTimeObj + "&" + \
                        "page=" + self.urlInfo["page"]

            requestUrl = self.reqUrl + "?" + paramsEtc

            try:

                html = req.get(requestUrl)
            except req.exceptions.RequestException as e:
                print (e)
                sys.exit(1)

            # save each page as html_file_<page>.html
            htmlName = "html_file_{}.html".format(p)
            htmlFilePath = r"C:\Users\ezfarm\PycharmProjects\ElasticSearchProj\htmlObj\{}".format(htmlName)

            try:
                with open(htmlFilePath, "w") as htmlFile:
                    htmlFile.write(html.text)
            except (OSError, UnicodeError) as e:
                print ("html file write error", e)
            else:
                print ("data file {} write success !!!".format(p))

    @classmethod
    def urlRequestSetting(cls):
        """ request settings (url, etcParam) read from url.yml
        """

        try:

            f = open(r"C:\Users\ezfarm\PycharmProjects\ElasticSearchProj\conf\url.yml", "r", encoding="utf-8")

        except FileNotFoundError as e:
            print(e)
            sys.exit(1)
        else:
            with f:  # the file is closed when the block exits
                yDoc = yaml.safe_load(f)
            return yDoc


    @classmethod
    def getCurrTime(cls):
        """ set the search date, e.g. 20190712
        """

        currObjTime = time.strftime("%Y%m%d", localtime())
        return currObjTime


    @classmethod
    def isAliveElastic(cls, elaAddress):
        """ check whether the elasticsearch server is alive
        """

        try:

            req.get("http://" + elaAddress + ":9200")

        except req.exceptions.RequestException as e:
            # the server is down
            print(e)
            sys.exit(1)
        else:
            print("elasticsearch server is alive !!!")
            return Elasticsearch(host=elaAddress)


    @classmethod
    def elaInformation(cls):
        """ read the elasticsearch server address and return a client for it
        """

        path = r"C:\Users\ezfarm\PycharmProjects\ElasticSearchProj\conf\elainfo.json"

        try:

            f = open(path, "r", encoding="utf-8")
        except OSError as e:
            print(e)
            sys.exit(1)
        else:
            with f:
                jsonDoc = json.load(f)
            elasticNode = NaverNews.isAliveElastic(jsonDoc.get("ela"))
            return elasticNode


def main():
    elanode = NaverNews()
    elanode.isTrue()
    #elanode.doRequests()

if __name__ == "__main__":
    main()
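
The keyword check in `textPreprocessing` is the part that decides which headlines get loaded, so it is handy to try it on its own. The sketch below keeps the same keyword list and the same cleanup steps; the function name `match_title` and the sample headlines are made up for the example.

```python
# Same keywords as in textPreprocessing (Kim Jong-un, Lee Myung-bak, missile)
KEYWORDS = ["김정은", "이명박", "미사일"]


def match_title(raw_title, keywords=KEYWORDS):
    """Clean a headline and flag it when it contains any keyword."""
    title = str(raw_title).strip().replace("\n", "")
    return {"title": title, "isTrue": any(word in title for word in keywords)}


# Hypothetical headlines: one with a keyword, one without
hit = match_title("  북한 미사일 발사\n")
miss = match_title("오늘의 날씨")
print(hit)
print(miss)
```

`any()` stops at the first matching keyword, which is the same effect as the `break` in the original loop.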


Jeju Island trip, day 1

Travel/Jeju Island | 2019. 6. 13. 01:56


elasticsearch + java _api + match_all

ELK/elasticsearch | 2019. 6. 4. 14:52


package com.jh.ela;
import java.io.IOException;

import org.elasticsearch.ElasticsearchException;
import org.elasticsearch.action.search.SearchRequest;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.search.SearchHit;
import org.elasticsearch.search.SearchHits;
import org.elasticsearch.search.builder.SearchSourceBuilder;
import static org.elasticsearch.index.query.QueryBuilders.*;

import com.jh.server.Server;

/**
 * Search client built with the Builder pattern.
 * @author ezfarm
 */

public class Srcher { 
	
	String srchwrd ;             // search term
	Server srvernode;
	RestHighLevelClient elanode;
	String searchIndex;  
	int size;
	int from;
	
	static public class Builder {
		
		String srchwrd   = null;
		Server srvernode = null;
		RestHighLevelClient elanode = null;
		String searchIndex = null;
		int size = 0;
		int from = 0;
		
		public Builder(String srchwrd) {
			// search term
			this.srchwrd = srchwrd;
		}
		
		/**
		 * Page settings
		 * @param size number of hits to return
		 * @param from offset of the first hit
		 * @return this builder
		 */
		public Builder withPages (int size, int from) {
			
			// server
			this.srvernode   = new Server();
			// index to search
			this.searchIndex = this.srvernode.searchIndex;
			this.size = size;
			this.from = from;
			
			return this;
		}
		
		public Srcher build() {
			return new Srcher(this);
		}
	}
	
	public Srcher(Builder builder) {
		
		srchwrd     = builder.srchwrd;
		srvernode   = builder.srvernode;
		elanode     = builder.srvernode.getElaserver();
		searchIndex = builder.searchIndex;
		size        = builder.size;
		from        = builder.from;
		
	}
	
	/**
	 * Client node close !!!
	 */
	public void serverDie() {
		
		try {
			
			elanode.close();
			System.out.println("client close !!!");
		} catch (IOException e) {
			// TODO Auto-generated catch block
			e.printStackTrace();
		}
	}
	// search test
	public void fullTextSearch() {
		
		/*
		 *  {
		 *  	"size" : 180,
		 *  	"query" : {
		 *  		"match_all": {}
		 *  	}
		 *  }
		 */
		System.out.println("searchIndex =>  " + searchIndex);
		SearchRequest searchRequest = new SearchRequest(searchIndex);
		
		SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
		searchSourceBuilder.query(matchAllQuery()); 
		searchSourceBuilder.size(180); 
		searchRequest.source(searchSourceBuilder);
		
		if (elanode != null) {
			System.out.println("RestHighLevel node success!!!");
		}
		
		try {
			
			SearchResponse searchResponse = elanode.search(searchRequest, RequestOptions.DEFAULT);
			System.out.println("status =>  " + searchResponse.status());
			SearchHits hits = searchResponse.getHits();
			
			// print the _source of every hit as a JSON string
			for (SearchHit hit : hits) {
				System.out.println(hit.getSourceAsString());
			}
			
		} catch (ElasticsearchException | IOException e) {
			System.out.println(e);
		} 
	}
}
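
For reference, the request that `SearchSourceBuilder` produces can be written out as a plain JSON body. The Python sketch below only builds that body (nothing is sent); `match_all_body` is a made-up helper name. It shows that `size` and `from` sit next to `query` at the top level of the body, not inside it.

```python
import json


def match_all_body(size, from_=0):
    """Build a match_all search body with paging parameters."""
    return {
        "from": from_,              # offset of the first hit
        "size": size,               # number of hits to return
        "query": {"match_all": {}},
    }


search_body = match_all_body(180)
print(json.dumps(search_body, indent=2))
```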


naver music crawling + elasticsearch

Language/python | 2019. 5. 22. 06:30


from time import localtime, strftime
from bs4 import BeautifulSoup
import requests
import json

from Ela.Elast import Elarv

class NMusic:

    def __init__(self):
        self.url = NMusic.getInformation()

    def getUrl(self):
        html = requests.get(self.url)
        if html.status_code == 200:
            bsObject = BeautifulSoup(html.text, "html.parser")
            print("title : {}".format(bsObject.title.string))
            top100 = bsObject.select_one("table.home_top100 > tbody")

            for r in range(1, 11):

                lst = top100.select_one("tr._tracklist_move._track_dsc.list{rank}".format(rank=r))
                if lst is None:
                    # no row for this rank on the page
                    continue
                # --------------------------------------------------------------
                rnk     = lst.select_one("td.ranking > span.num")               # - rank
                nme     = lst.select_one("td.name > span.m_ell > a")            # - song title
                artist  = lst.select_one("td._artist > span.m_ell > a._artist") # - artist
                insrtDay= strftime("%Y%m%d", localtime())                       # - insert date (YYYYMMDD)

                d = {"rank" : rnk.string,
                     "name" : nme.string,
                     "artist" : artist.string,
                     "insertdate" : insrtDay}

                Elarv.insertDocuments(d)
                print ("load success !!!")
                # print (insrtDay)
                # --------------------------------------------------------------
                #print ("{ranking} => {songname} : {artist}".format(ranking = rnk.string, songname = nme.string, artist = artist.string))

    @classmethod
    def getInformation(cls):

        try:

            f = open(r"C:\Users\junhyeon.kim\Desktop\StuEla\clw\info.json", "r", encoding="utf-8")
        except FileNotFoundError as e:
            print (e)
            raise SystemExit(1)  # without the url there is nothing to crawl
        else:
            jsonDoc = dict(json.load(f)).get("url")
            f.close()
            return jsonDoc

def main():
    m = NMusic() # create the object
    m.getUrl()

if __name__ == "__main__":
    main()

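# Ela/Elast.py (the module imported above as Ela.Elast)
from elasticsearch import Elasticsearch

class Elarv:

    @classmethod
    def insertDocuments(cls, elements):
        # index a single document into the "nmusic" index
        el = Elasticsearch(hosts="192.168.240.10")
        el.index(index="nmusic", doc_type="doc", body=elements)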

def main():
    enode = Elarv()

if __name__ == "__main__":
    main()


elasticsearch : count

ELK/elasticsearch | 2019. 5. 21. 06:07


#!/bin/bash

# Count the documents in movie-index whose movie_jangre field is "드라마" (drama)
curl -XGET 'http://192.168.240.10:9200/movie-index/_count?pretty' -H 'Content-Type: application/json' -d '
{
  "query" : {
    "term" : { "movie_jangre" : "드라마" }
  }
}'
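
The same count request can be put together from Python with only the standard library. This is a sketch that builds the request object without sending it; `build_count_request` is a hypothetical helper, and POST is used because `_count` accepts a body with either GET or POST.

```python
import json
import urllib.request


def build_count_request(base_url, index, field, value):
    """Build (but do not send) a _count request with a term query."""
    body = json.dumps({"query": {"term": {field: value}}}, ensure_ascii=False)
    return urllib.request.Request(
        url="{}/{}/_count".format(base_url, index),
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


count_request = build_count_request(
    "http://192.168.240.10:9200", "movie-index", "movie_jangre", "드라마"
)
print(count_request.full_url)
print(count_request.data.decode("utf-8"))

# To actually run it against a live cluster:
#   with urllib.request.urlopen(count_request) as resp:
#       print(json.load(resp)["count"])
```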


Naver Q&A solution

Language/C | 2019. 5. 18. 08:07


https://kin.naver.com/qna/detail.nhn?d1id=1&dirId=1040101&docId=327353769&mode=answer

# include <stdio.h>
# include <stdlib.h>
# include <math.h>
# define ERROR_ 1
typedef struct Triangle 
{
	double bottomLine; // base
	double heightLine; // height
	double longLine;   // hypotenuse
}Tri;

// dynamic allocation and data initialization ___________
void _init_(Tri** tparam);

// free memory ________________________
void _memoryFree_(Tri** tparam);

// input base and height_____________________
void _numberInput_(Tri** tparam);

// print the hypotenuse length____________________
void _longLinePrintf_(Tri** tparam);

int main(void) 
{
	Tri* tnode = NULL;
	_init_(&tnode);
	_numberInput_(&tnode);
	_longLinePrintf_(&tnode);
	_memoryFree_(&tnode);
	return 0;
} // end of main function 

// dynamic allocation and data initialization ___________
void _init_(Tri** tparam)
{
	(*tparam) = (Tri*)malloc(sizeof(Tri));
	
	if ((*tparam) == NULL) 
	{
		printf("malloc error");
		exit(ERROR_);
	}
	else // (*tparam) != NULL
	{
		(*tparam)->bottomLine = 0.0;
		(*tparam)->heightLine = 0.0;
		(*tparam)->longLine = 0.0;
	}
} // end of _init_ function 

// free memory ________________________
void _memoryFree_(Tri** tparam)
{
	free((*tparam));
	(*tparam) = NULL; // do not leave a dangling pointer in the caller
} // end of _memoryFree_ function 

// input base and height_____________________
void _numberInput_(Tri** tparam)
{
	printf("Base? ");
	scanf_s("%lf", &(**tparam).bottomLine);

	printf("Height? ");
	scanf_s("%lf", &(**tparam).heightLine);
}

// print the hypotenuse length____________________
void _longLinePrintf_(Tri** tparam)
{
	double c =
		((**tparam).bottomLine * (**tparam).bottomLine) +
		((**tparam).heightLine * (**tparam).heightLine);
	
	(** tparam).longLine = sqrt(c);
	printf("Hypotenuse length: %lf\n", (** tparam).longLine);
}
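
As a quick cross-check of the calculation, the same hypotenuse can be computed in a few lines of Python. This is just a sketch for verifying values by hand; the function name `hypotenuse` is made up, and `math.hypot` is used as a comparison for the explicit square root.

```python
import math


def hypotenuse(base, height):
    """Length of the hypotenuse of a right triangle: sqrt(base^2 + height^2)."""
    return math.sqrt(base * base + height * height)


print(hypotenuse(3.0, 4.0))
print(math.hypot(3.0, 4.0))
```

Both lines print 5.0 for a base of 3 and a height of 4.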


elasticsearch + shell01

ELK/elasticsearch | 2019. 5. 11. 11:59


#!/bin/bash 

index_name=$1 

if [ -n "$index_name" ]; then 
  curl -X GET "http://192.168.240.10:9200/${index_name}/_count?pretty" 
fi

Run

$ ./index-count.sh kimjh
{
  "count" : 17,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  }
}
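
If the count is needed inside another script, the JSON shown above can be parsed directly. A minimal Python sketch, using the response printed above as a literal string (in practice it would come from the curl output or an HTTP client):

```python
import json

# The response printed by index-count.sh above
response_text = """
{
  "count" : 17,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  }
}
"""

doc = json.loads(response_text)
count = doc["count"]
# The count only covers every shard if none of them failed
all_shards_ok = doc["_shards"]["failed"] == 0
print(count, all_shards_ok)
```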

#!/bin/bash

echo "==================================================="
echo " (Co., Ltd.) "
echo " Date : $(date "+%Y-%m-%d")"
echo " Author : 김준현 / Manager"
echo "==================================================="
response=$(curl -s -XGET 'http://192.168.240.10:9200/_cat/indices/test*?pretty')
echo "$response"
echo "==================================================="
